Evaluation of Operational Safety Assessment (OSA) Metrics for Automated Vehicles in Simulation

April 2021
SAE Technical Papers

Authors:

Arizona State University

The operational safety of automated driving system (ADS)-equipped vehicles (AVs) must be quantified using well-defined metrics in order to gain an unambiguous understanding of the level of risk associated with AV deployment on public roads. In this research, efforts to evaluate the operational safety assessment (OSA) metrics introduced in prior work by the Institute of Automated Mobility (IAM) are described. An initial validation of the proposed set of OSA metrics involved using the open-source simulation software Car Learning to Act (CARLA) and Scenario Runner, which are used to place a subject vehicle in selected scenarios and obtain measurements for the various relevant OSA metrics. Car following scenarios were selected from the list of 37 pre-crash scenarios identified by the National Highway Traffic Safety Administration (NHTSA) as the most common driving situations that lead to crash events involving two light-duty vehicles. The resulting data were used to evaluate different parameters and thresholds of the metrics developed in the prior IAM work. The simulation and analysis results were used to evaluate the relevant metrics in the context of a proposed criteria as measurable and applicable to the operational safety of AVs and human-driven vehicles alike in a data-driven approach.

2021-01-0868 Published 06 Apr 2021

Evaluation of Operational Safety Assessment (OSA) Metrics for Automated Vehicles in Simulation

Maria Soledad Elli Intel Corp.

Jerey Wishart Exponent Inc.

Steven Como and Siddhaarthan Dhakshinamoorthy Arizona State University

Jack Weast Intel Corp.

Citation: Elli, M.S., Wishart, J., Como, S., Dhakshinamoorthy, S. et al., “Evaluation of Operational Safety Assessment (OSA) Metrics for Automated Vehicles in Simulation,” SAE Technical Paper 2021-01-0868, 2021, doi:10.4271/2021-01-0868.

Abstract

The operational safety of automated driving system (ADS)-equipped vehicles (AVs) must be quantified using well-defined metrics in order to gain an unambiguous understanding of the level of risk associated with AV deployment on public roads. In this research, efforts to evaluate the operational safety assessment (OSA) metrics introduced in prior work by the Institute of Automated Mobility (IAM) are described. An initial validation of the proposed set of OSA metrics involved using the open-source simulation software Car Learning to Act (CARLA) and Scenario Runner, which are used to place a subject vehicle in selected scenarios and obtain measurements for the various relevant OSA metrics. Car-following scenarios were selected from the list of 37 pre-crash scenarios identified by the National Highway Traffic Safety Administration (NHTSA) as the most common driving situations that lead to crash events involving two light vehicles. The resulting data were used to evaluate different parameters and thresholds of the metrics developed in the prior IAM work. The simulation and analysis results were used to evaluate the relevant metrics in the context of a proposed set of criteria as measurable and applicable to the operational safety of AVs and human-driven vehicles alike in a data-driven approach.

Introduction

As the development of automated driving system (ADS)-equipped vehicles (AVs) continues, the need for a process to evaluate the operational safety performance of the technology has become ever more apparent. The process must provide a consistent, unbiased, and technology-neutral evaluation that will provide public confidence as AVs are deployed.

A possible process for this operational safety performance evaluation is to use the concept of a formalized Safety Case Framework (SCF). A safety case is “a structured argument, supported by a body of evidence, that provides a compelling, comprehensible, and valid case that a product is safe for a given application in a given environment.” [1] An SCF, an example of which is the UL 4600 standard [2], will contain a variety of possible verification and validation (V&V) methods that are used in developing the required evidence to support the AV safety case. A subset of V&V methods is testing methods, which include simulation testing, closed course testing, public road testing, or some combination of the three; each involves placing the AV under test in a set of traffic scenarios and evaluating the operational safety performance. A comprehensive evaluation methodology, with validated metrics, must be developed to derive safety case evidence from the test conduct of a given scenario; to the authors’ knowledge, such a methodology does not exist in the literature.

The Institute of Automated Mobility (IAM) was formed by the Governor’s Executive Order in 2018 to help provide guidance on AVs in the state of Arizona. The IAM has been conducting research to develop an operational safety assessment (OSA) methodology, along with OSA metrics, to be used in an SCF as the evaluation methodology for traffic scenario testing. The intent of the OSA methodology is to evaluate the navigation of a given traffic scenario by a vehicle (primarily AVs, but it can also be used for human-driven vehicles) using the OSA metrics measurements and assigning a score for said test. The aggregate score over the set of traffic scenarios can then be used to build the safety case for the AV, which in turn can be used by AV developers and authorities having jurisdiction (AHJs) to evaluate the status of an AV throughout its development and allow for a determination of readiness for various stages of deployment. The safety case provides assurance of a level of safety achieved by an AV, which is imperative for gaining public trust.

Downloaded from SAE International by Jeffrey Wishart, Monday, April 05, 2021

EVALUATION OF OPERATIONAL SAFETY ASSESSMENT (OSA) METRICS FOR AUTOMATED VEHICLES IN SIMULATION 2

In 2020, IAM researchers proposed a set of OSA metrics in Wishart et al. [3] that were compiled and adapted from a comprehensive literature review. The OSA metrics set is, to the authors’ knowledge, the only comprehensive set in the literature that can be used to measure the operational safety of both AVs and human-driven vehicles.

In the current work, a subset of the OSA metrics proposed in [3] has been measured and evaluated for a selection of traffic scenarios in order to refine the metrics where appropriate. This evaluation has been conducted using simulation software, including Car Learning to Act (CARLA) [4] and ScenarioRunner [5], to determine the OSA metric measurements for a subset of scenarios chosen from NHTSA’s list of 37 pre-crash scenarios in [6].

The proposed set of OSA metrics included sets of parameters and thresholds which were adapted from existing literature and other research studies. Many of these parameters and thresholds are considered subjective and require further research for refinement. The analysis conducted in this work includes an attempt to determine values for parameters and thresholds for some of the metrics that were left as future work in [3]. The simulation results presented in this work provide further insight into the consequences of differing thresholds and parameters for a variety of car-following scenarios, as well as the performance of the proposed OSA metrics in the context of assessing the operational safety of AVs.

The outline of the paper is as follows. First, the OSA metrics are described and summarized. The parameters and thresholds used in the experiments are then listed. The traffic scenarios, including the selection process, are then described. The simulation methodology is discussed, including the CARLA software and test vehicle model. Next, the simulation results are presented and discussed. Finally, overall conclusions and future work are described.

OSA Metrics

The proposed set of OSA metrics was introduced in Wishart et al. [3]. The objective was to develop a comprehensive set that would allow for an assessment of the operational safety of a vehicle (human-driven or automated driving) in a variety of scenarios as part of an SCF. A novel taxonomy is proposed here to organize the OSA metrics into three categories: (1) Black Box metrics, (2) Grey Box metrics, and (3) White Box metrics:

• A Black Box metric allows for measurement of data that can be obtained without requiring any access to ADS data. This could be from an on-board or off-board source (e.g., public road infrastructure, or CAN bus data). However, using ADS data may enhance the accuracy and precision of the measurement(s).

• A Grey Box metric allows for measurement of data that can only be obtained with limited access to ADS data.

• A White Box metric allows for measurement of data that can only be obtained with significant access to ADS data.

There are trade-offs for each metric type that make it advantageous (or disadvantageous) in particular use cases. For example, the Black Box metrics may be preferable where access to proprietary ADS data is not desired. Conversely, White Box metrics allow for specific sub-systems in the ADS to be assessed rather than just the AV system as a whole. The Grey Box metrics represent a balance between sensitivity to proprietary data and assessment granularity. It should be noted that while Black Box metrics are useful for both human-driven vehicles and AVs, Grey Box and White Box metrics are only applicable to AVs.

The proposed OSA metrics are shown in Table 1, along with the proposed taxonomy. This proposed set is comprised only of Black Box and Grey Box metrics since White Box metrics rely on shared data from AV developers and may be unavailable for evaluation purposes. The Black Box metrics can be further categorized as Minimum Safe Distance-Related, Universal, and Traffic Engineering-Related.

The Minimum Safe Distance-Related metrics are based on the Responsibility-Sensitive Safety (RSS) model [8]. Universal metrics are defined as those which apply to both human-driven vehicles and AVs, including events such as Collision Incidents and Traffic Law Violations. It should be noted that the latter metric in [3] was originally “Rules-of-the-Road Violation” but has since been changed to “Traffic Law Violation” since the definition of Rules of the Road is ambiguous and can include customary practices for a particular region. Traffic engineering metrics consist of traditionally used surrogate safety metrics that have been heavily researched in the past [9]. Lastly, the Grey Box metrics include two metrics (ADS Active (ADSA) and Achieved Behavioral Competency (ABC)) that indicate whether the ADS is completing the dynamic driving task (DDT) and whether the AV accomplishes the trajectory as planned, respectively. The Human Traffic Control Detection Error Rate (HTCDER) and Minimum Safe Distance Calculation Error (MSDCE) provide insight into the perception system performance without requiring raw sensor data (which is particularly sensitive, proprietary data) but rather much more limited ADS data. While the MSDCE metric is also based on the RSS model, it is classified as a “Grey Box” metric because some ADS data are required to calculate the metric value.

TABLE 1 Taxonomy for OSA metrics

Black Box: Minimum Safe Distance-Related | Black Box: Universal | Black Box: Traffic Engineering-Related | Grey Box
Minimum Safe Distance Violation (MSDV) | Collision Incident (CI) | Time-to-Collision Violation (TTCV) | Human Traffic Control Detection Error Rate (HTCDER)
Proper Response Action (PRA) | Traffic Law Violation (TLV) | Modified Time-to-Collision Violation (MTTCV) | ADS Active (ADSA)
Minimum Safe Distance Factor (MSDF) | Human Traffic Control Violation Rate (HTCVR) | Post-Encroachment Time Violation (PETV) | Achieved Behavioral Competency (ABC)
- | Aggressive Driving (AD) | - | Minimum Safe Distance Calculation Error (MSDCE)

It should benoted that the authors continue to monitor

the literature for additional metrics to beconsidered for the

OSA metrics, such as Time Headway (THW) from [10],

Collision Avoidance Capability (CAC) from [11], and Model

Predictive Instantenous Safety Metric (MPrISM) from [12].

Future work will consider these metrics to beincluded in the

OSA metrics set. e OSA metrics from Table 1 are also being

considered for the SAE J3237 Information Report currently

being developed by the V&V Task Force of the SAE On-Road

Automated Driving (ORAD) Committee.

Selection and Formulation of OSA Metrics

For this work, not all of the proposed OSA metrics are measurable when using the employed simulation. Within the CARLA simulator, ground truth data of the vehicles’ positions, speeds, and accelerations were used to calculate the aforementioned metrics; therefore, metrics related to quantification of localization and tracking errors, such as MSDCE and HTCDER, are not possible to obtain. Additionally, metrics related to the behavior of the subject vehicle under test (i.e., the AV), such as ABC, HTCVR, ADSA, and PRA, are inapplicable to this work as the algorithm controlling the vehicles’ behavior does not reflect a real AV driving policy.

Therefore, the OSA metrics evaluated in the presented work are:

• MSDV

• TTCV

• MTTCV

• PETV

In addition to the previously discussed metrics, the THW metric discussed in [10] and [13] was also included in the analysis. THW is a time-based metric similar to TTC; however, THW is based only on the distance between the following and lead vehicles in relation to the speed of the following vehicle, rather than the difference in speed. Similar to other time-based metrics, the lower the THW value, the higher the risk of a collision; therefore, THW is often used in the literature with a pre-determined threshold. This implies that when the threshold is met, the situation has become unsafe and a proper response action is required. An example of the usage of this metric is the latest United Nations Economic Commission for Europe (UNECE) regulation on Automated Lane-Keeping Systems (ALKS) [10]. The THW formulation was modified to align with the other metrics, so that a violation is registered when THW falls to or below the threshold (Equation 5 in Table 2); this introduces the Time Headway Violation (THWV).

The details of the selected metrics are shown in Table 2 (note that the Distance to Stop Violation (DSV) is discussed below). The selected metrics are relevant to the analysis of operational safety and represent the first step in the evaluation of the proposed OSA metrics.

In order to evaluate the performance of the chosen OSA metrics, three evaluation criteria were developed to analyze the efficacy of a given metric in assessing the safety of a driving situation:

• Robustness to changing scenario configurations

• Relevance

• Comparison to a ground truth metric

Robustness to changing scenario configurations refers to the metrics providing a timely warning that changes appropriately as scenario conditions vary, such as vehicle initial position and speed, relative headway of the vehicles, changes in the environment, etc. For example, if the scenario configuration changes but the timing of the metric violation does not, then the metric robustness is lower.

Relevance refers to the metric providing safety information throughout the scenario for all scenario permutations. For example, if there is one (or more) instance(s) of the metric exhibiting a nonsensical value (e.g., a denominator being 0), the metric relevance is lower.

For the purpose of this work, a ground truth metric, Distance to Stop Violation (DSV), has been established to enable comparisons of the effectiveness of metrics in identifying a potentially unsafe situation preemptively. This criterion thus evaluates the timing of the metric violation. The ground truth DSV metric is based on the distance that it would take the subject vehicle to come to a full stop at a determined deceleration (DSTOP in Equation 6 in Table 2). When the follower vehicle is driving at a distance from the lead vehicle that is less than or equal to the distance required for this deceleration to occur, a DSV has occurred. This metric indicates a potentially unsafe situation under ideal conditions, meaning that this metric does not consider the velocity or deceleration capabilities of the lead vehicle (or rather it assumes a stopped lead vehicle) nor the road conditions. It also assumes that the subject vehicle can reach the defined a_brake instantly. Because of this, two different values for a_brake were selected and evaluated with simulation in order to provide different “thresholds” for comparison with the ground truth. These values were taken from [10] and reflect the minimum deceleration needed to perform an emergency maneuver (a_brake = 5 m/s²) and a reasonable (maximum) deceleration applied by automatic emergency braking (AEB) systems (a_brake = 8.3 m/s²). The implementation of this evaluation criterion is described in the Metrics Observations and Discussion section.
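As a rough illustration of the ground truth criterion, the sketch below evaluates the stopping distance for the two a_brake values quoted above. It assumes the standard constant-deceleration relation DSTOP = v_F² / (2 · a_brake) with instantaneous brake onset, which is how Equation 6 is read here. The function names and the 18 m/s example speed (the LVS_18 target speed) are illustrative choices and not part of the authors' pipeline.

```python
def stopping_distance(v_f: float, a_brake: float) -> float:
    """Distance [m] needed to stop from v_f [m/s] at constant deceleration a_brake [m/s^2]."""
    if a_brake <= 0:
        raise ValueError("a_brake must be positive")
    return v_f ** 2 / (2.0 * a_brake)


def dsv(d_long: float, v_f: float, a_brake: float) -> int:
    """Return 1 when the gap d_long [m] is at or below the stopping distance, else 0."""
    return int(d_long <= stopping_distance(v_f, a_brake))


if __name__ == "__main__":
    v_f = 18.0  # assumed follower speed [m/s]
    for a_brake in (5.0, 8.3):
        print(f"a_brake={a_brake} m/s^2 -> DSTOP={stopping_distance(v_f, a_brake):.2f} m")
```

Under these assumptions, the lower deceleration value produces the longer stopping distance, so it flags a DSV earlier in an approach than the AEB-level value does.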

These three criteria will be used to evaluate the metrics based on the experimental results in the following sections.

OSA Metrics Parameterization

The OSA metrics evaluated in this work included subjective assumptions for thresholds determining when a violation of a metric occurred. Therefore, this work focused on evaluating the impact of different threshold and parameter values used to define the OSA metrics.

Time-Based Metrics: The implementation of time-based metrics such as TTCV, MTTCV, PETV, and THWV is highly dependent on the threshold values assigned to them, and it is therefore important to evaluate the metrics’ performance as a function of differing thresholds. The sets of values chosen for the TTCV, MTTCV, and THWV thresholds are based on values suggested by previous literature reviewed in [3]. In the case of PETV, the threshold values were chosen according to the distribution of PET values found in the simulated data, as the spread of PET values was smaller than that of the rest of the time-based metrics (see Figure 7). The threshold values for the time-based metrics evaluated in this work are shown in Table 3.

TABLE 2 OSA metrics formulation

Minimum Safe Distance Violation

$$MSDV' = \begin{cases} 1 & \text{if } d_{lat} < d_{lat}^{min} \wedge d_{long} < d_{long}^{min} \\ 0 & \text{else} \end{cases} \qquad MSDV = \begin{cases} 1 & \text{if } MSDV' = 1 \wedge \text{Originated by AV} \\ 0 & \text{else} \end{cases} \tag{1}$$

Time to Collision Violation

$$TTC = \frac{X_L - X_F}{v_F - v_L} \qquad TTCV = \begin{cases} 1 & \text{if } TTC \le \text{threshold} \\ 0 & \text{else} \end{cases} \tag{2}$$

Modified Time to Collision Violation

$$MTTC = \frac{-\Delta V \pm \sqrt{\Delta V^2 + 2 \Delta A \, D}}{\Delta A} \qquad MTTCV = \begin{cases} 1 & \text{if } MTTC \le \text{threshold} \\ 0 & \text{else} \end{cases} \tag{3}$$

Post Encroachment Time Violation

$$PET = t_2 - t_1 \qquad PETV = \begin{cases} 1 & \text{if } PET \le \text{threshold} \\ 0 & \text{else} \end{cases} \tag{4}$$

Time Headway Violation

$$THW = \frac{X_L - X_F}{v_F} \qquad THWV = \begin{cases} 1 & \text{if } THW \le \text{threshold} \\ 0 & \text{else} \end{cases} \tag{5}$$

Distance to Stop Violation

$$DSTOP = \frac{v_F^2}{2 \, a_{brake}} \qquad DSV = \begin{cases} 1 & \text{if } d_{long} \le DSTOP \\ 0 & \text{else} \end{cases} \tag{6}$$

Where:

$d_{long}$: longitudinal distance between two vehicles

$d_{lat}$: lateral distance between two vehicles

$d_{long}^{min}$: minimum longitudinal distance between two vehicles ([8])*

$d_{lat}^{min}$: minimum lateral distance between two vehicles ([8])*

$X_L$: Leading vehicle position

$X_F$: Following vehicle position

$v_L$: Leading vehicle speed

$v_F$: Following vehicle speed

$\Delta V$: Relative velocity

$\Delta A$: Relative acceleration

$D$: Relative space gap (equivalent to $d_{long}$ in car-following situations)

$t_2$: Arrival time of (any part of) Vehicle 2 at Conflict Point

$t_1$: Arrival time of (any part of) Vehicle 1 at Conflict Point

$a_{brake}$: Following vehicle deceleration

*Note: The formulae for $d_{long}^{min}$ and $d_{lat}^{min}$ are in Appendix A (equations 7 and 8).
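The time-based formulations in Table 2 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' post-processing pipeline: TTC is read as (X_L − X_F) / (v_F − v_L), THW as (X_L − X_F) / v_F, a non-closing gap is reported as an infinite value rather than a division by zero, and MTTC takes the smallest positive root of the constant-relative-acceleration equation. The sign convention (ΔV = v_F − v_L, ΔA = a_F − a_L) and all function names are assumptions made for illustration.

```python
import math


def ttc(x_l, x_f, v_l, v_f):
    """Time to collision [s]; infinite when the follower is not closing the gap."""
    closing_speed = v_f - v_l
    if closing_speed <= 0.0:
        return math.inf
    return (x_l - x_f) / closing_speed


def mttc(gap, dv, da):
    """Modified TTC [s]: smallest positive root of 0.5*da*t^2 + dv*t - gap = 0."""
    if abs(da) < 1e-9:  # no relative acceleration: reduces to plain TTC
        return gap / dv if dv > 0.0 else math.inf
    disc = dv * dv + 2.0 * da * gap
    if disc < 0.0:
        return math.inf  # the gap never closes
    roots = [(-dv + sign * math.sqrt(disc)) / da for sign in (1.0, -1.0)]
    positive = [t for t in roots if t > 0.0]
    return min(positive) if positive else math.inf


def thw(x_l, x_f, v_f):
    """Time headway [s]; infinite when the follower is stationary."""
    return (x_l - x_f) / v_f if v_f > 0.0 else math.inf


def violation(value, threshold):
    """1 when the metric value is at or below its threshold, else 0."""
    return int(value <= threshold)


if __name__ == "__main__":
    # Assumed snapshot: 30 m gap, follower at 15 m/s, leader at 10 m/s.
    print(ttc(30.0, 0.0, 10.0, 15.0), thw(30.0, 0.0, 15.0), mttc(30.0, 5.0, 0.0))
```

Returning infinity for a non-closing gap is one way to keep the metric defined for every time step, which bears on the relevance criterion described earlier; other conventions are possible.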

Minimum Safe Distance-Related Metric: For the Minimum Safe Distance-Related metric, there are four parameters that were examined according to the definitions in [8]:

1. Reaction time of the ADS of the subject vehicle, $\rho$

2. Maximum acceleration of the subject vehicle during the response duration, $a_{\max,accel}^{long}$

3. Minimum deceleration of the subject vehicle after the response duration in order to avoid a collision, $a_{\min,decel}^{long}$

4. Maximum assumed deceleration capability of the other vehicle, $a_{\max,decel}^{long}$

The values chosen for the above-mentioned parameters were extracted from previous research that determined such values based on naturalistic driving data and further simulation experiments. For the purpose of this work, the sets of values were separated into three different categories (shown in Table 4): Aggressive, Conservative, and NDS (with NDS being Naturalistic Driving Study). The values for the Aggressive and Conservative categories were adopted from [14], in which a Falsification Search Engine using simulation was used to find RSS parameter values with an associated robustness value, with robustness being a measure of how close the vehicles under test were during the simulated scenarios. From the search, clusters of parameter sets were divided into Aggressive and Conservative categories as they resulted in more aggressive (i.e., shorter following distances) and more conservative (i.e., larger following distances) behaviors, respectively. The values under the NDS category were adopted from the China Intelligent Transportation Systems (C-ITS) Alliance standard #0116-2019 [15]. The values in this standard were defined after analyzing 3 years of naturalistic driving data collected from Shanghai highways in China.
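To give a feel for how the parameter categories change the minimum safe distance, the sketch below evaluates the longitudinal RSS minimum distance for the three parameter sets in Table 4. The formula is written here from the general RSS definition in [8]; the paper's own statement of it is in Appendix A, which is not reproduced in this section, so the exact form should be treated as an assumption. The function and dictionary names and the example speeds are illustrative.

```python
# Parameter sets from Table 4: reaction time [s] and accelerations [m/s^2].
RSS_PARAMS = {
    "Aggressive":   {"rho": 0.5, "a_max_accel": 4.1, "a_min_decel": 4.6, "a_max_decel": 8.0},
    "Conservative": {"rho": 1.9, "a_max_accel": 5.9, "a_min_decel": 4.1, "a_max_decel": 9.5},
    "NDS":          {"rho": 0.2, "a_max_accel": 1.8, "a_min_decel": 3.6, "a_max_decel": 6.1},
}


def rss_min_long_distance(v_rear, v_front, rho, a_max_accel, a_min_decel, a_max_decel):
    """Longitudinal RSS minimum safe distance [m] for two vehicles in the same direction.

    Assumed worst case: the rear vehicle accelerates at a_max_accel during the
    reaction time rho and then brakes at a_min_decel, while the front vehicle
    brakes at a_max_decel. The result is floored at zero.
    """
    v_after_reaction = v_rear + rho * a_max_accel
    distance = (
        v_rear * rho
        + 0.5 * a_max_accel * rho ** 2
        + v_after_reaction ** 2 / (2.0 * a_min_decel)
        - v_front ** 2 / (2.0 * a_max_decel)
    )
    return max(0.0, distance)


if __name__ == "__main__":
    # Assumed speeds: follower at 15 m/s, leader at 10 m/s.
    for name, params in RSS_PARAMS.items():
        print(name, round(rss_min_long_distance(15.0, 10.0, **params), 2))
```

Because the reaction time enters both the linear term and the squared post-reaction speed, the longer reaction time in the Conservative set dominates the result for these example speeds.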

Experimental Design

The experimental design in this work involved developing various scenarios and then implementing said scenarios in simulation, as described in the following sections.

Scenario Selection

One of the difficult challenges currently facing the AV industry is the selection of scenarios to be evaluated for the assessment of vehicle safety. Unique scenario generation is out of the scope of this project, and the authors relied upon the 37 pre-crash scenarios documented by the National Highway Traffic Safety Administration (NHTSA) [6]. From the list of pre-crash scenarios, a subset of car-following situations was selected for evaluation. The 37 pre-crash scenarios were filtered, and car-following scenarios that involved “Two-Vehicle”, “Light-Vehicle Crashes” were then selected based on frequency of occurrence. Scenarios involving signalized and unsignalized junctions are out of the scope of this work. Car-following scenarios were used in this work to simplify the simulation setup. Future work will expand the discussed methodology to consider more complex scenarios such as intersection-related environments. Details of the chosen scenarios extracted from [6] are summarized in Table 7 in Appendix B and include:

1. Lead vehicle stopped (LVS)

2. Lead vehicle decelerating (LVD)

3. Lead vehicle moving at lower constant speed (LVMLCS)

4. Lead vehicle accelerating (LVA)

Together, the four scenarios accounted for some 18.4% of all light-duty vehicle crashes from 2004-2008, according to Table 7 in [6].

Scenarios Realization

In this work, each selected scenario from NHTSA’s pre-crash typology is considered a scenario category in which all scenarios defined within the same category share the same behavior, but variations in the vehicles’ speeds and/or initial positions were defined to evaluate the sensitivity to changes in such conditions. Speed variations and positions for the vehicles were defined with the goal of creating relevant situations that may affect the resulting OSA metrics calculation. This allows for the inference of data trends when calculating the associated OSA metrics. In total, 13 different scenarios, shown in Table 5, were defined, simulated, and then analyzed. The scenarios defined in this study are not meant to be exhaustive of different driving situations, but rather an initial selection for studying the impact of different driving conditions of vehicles (e.g., vL > vF, vL = vF, etc.) on the OSA metrics.

In the simulation experiments, the selected scenarios involve only two vehicles, the subject vehicle and the lead vehicle. Both vehicles start from a rest position and accelerate to their target speeds unless explicitly stated otherwise. The two vehicles are positioned with the subject vehicle behind the lead vehicle in the same lane at a specified distance, on a straight road with no breaks (e.g., junctions, signals, turns, stop signs, etc.) that is long enough for both vehicles to achieve their designated target speeds. Each scenario progresses such that the subject vehicle approaches the lead vehicle and a collision occurs. The scenario is terminated at the collision time.

TABLE 3 Evaluated thresholds for time-based metrics

Metric | Thresholds [s]
TTCV | {1, 2, 3, 4, 5}
MTTCV | {1, 2, 3, 4, 5}
PETV | {0.5, 1, 1.2, 1.5, 2}
THWV | {1, 2, 3, 4, 5}

TABLE 4 Categories of RSS parameters for MSDV metric

Category | $\rho$ [s] | $a_{\max,accel}^{long}$ [m/s²] | $a_{\min,decel}^{long}$ [m/s²] | $a_{\max,decel}^{long}$ [m/s²]
Aggressive | 0.5 | 4.1 | 4.6 | 8
Conservative | 1.9 | 5.9 | 4.1 | 9.5
NDS | 0.2 | 1.8 | 3.6 | 6.1

The behavior of the vehicles in a scenario is described as follows:

Lead Vehicle Stopped (LVS): The lead vehicle is positioned at an initial distance ahead of the subject vehicle to provide enough distance for the subject vehicle to reach its designated target speed. The subject vehicle maintains this speed until it eventually collides with the lead vehicle. Throughout the entire scenario, the lead vehicle stays at rest. Example vehicle dynamics for the LVS scenarios are depicted in Figure 1.

Lead Vehicle Decelerating (LVD): The lead vehicle is positioned at an initial distance ahead of the subject vehicle and both vehicles start moving from rest position. Both vehicles maintain a constant acceleration after reaching the target speed. After vehicles achieve their target speed, the lead vehicle starts decelerating until reaching a full stop. The subject vehicle eventually collides with the decelerating lead vehicle. Example vehicle dynamics for the LVD scenarios are depicted in Figure 2.

Lead Vehicle Moving at Lower Constant Speed (LVMLCS): The lead vehicle is positioned at an initial distance ahead of the subject vehicle and both vehicles start moving from rest position. The target speed of the lead vehicle is lower than that of the subject vehicle. The subject vehicle eventually collides with the slower-moving lead vehicle. Example vehicle dynamics for the LVMLCS scenarios are depicted in Figure 3.

Lead Vehicle Accelerating (LVA): The lead vehicle is positioned at an initial distance ahead of the subject vehicle and both vehicles start moving from rest position. The lead vehicle has an initial speed that is lower than that of the subject vehicle. As the subject vehicle approaches the lead vehicle, the lead vehicle starts accelerating to its final target speed. The subject vehicle eventually collides with the accelerating lead vehicle. Example vehicle dynamics for the LVA scenarios are depicted in Figure 4.

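TABLE 5 Scenario categories and details

Scenario Category | Scenario ID | Initial Headway Distance [m] | Subject Vehicle Target Speed [m/s] | Lead Vehicle Target Speed [m/s]
Lead Vehicle Stopped | LVS_10 | 200 | 10 | 0
Lead Vehicle Stopped | LVS_15 | 200 | 15 | 0
Lead Vehicle Stopped | LVS_18 | 200 | 18 | 0
Lead Vehicle Decelerating | LVD_14 | 5* | 14.1 | 14
Lead Vehicle Decelerating | LVD_15 | 5* | 15 | 15.1
Lead Vehicle Decelerating | LVD_16 | 30 | 16 | 18
Lead Vehicle Decelerating | LVD_18 | 30 | 18 | 18
Lead Vehicle Decelerating | LVD_20 | 30 | 20 | 18
Lead Vehicle Moving at Lower Constant Speed | LVMLCS_12 | 30 | 12 | 10
Lead Vehicle Moving at Lower Constant Speed | LVMLCS_15 | 30 | 15 | 10
Lead Vehicle Moving at Lower Constant Speed | LVMLCS_20 | 30 | 20 | 15
Lead Vehicle Accelerating | LVA_15 | 30 | 15 | 20
Lead Vehicle Accelerating | LVA_20 | 30 | 20 | 25

*Note: Scenarios LVD_14 and LVD_15 are special situations in which the subject vehicle is initially 5 m away from the leading vehicle, both driving at high speeds, purposefully creating unsafe situations for the sake of metrics evaluation.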

FIGURE 1 Example: LVS_18 scenario dynamics

FIGURE 2 LVD_16 scenario dynamics

Simulation Setup

CARLA [4] is an open-source AV simulation software which can be used to simulate an AV and diverse sensor suites for use in different weather conditions and road topologies. ScenarioRunner for CARLA [5] is an open-source traffic scenario definition and execution engine that can be used for defining the interaction of an AV with other road users in different driving situations. Within ScenarioRunner, it is possible to define scenarios using the standard OpenScenario format [16], which allows for controlling involved traffic participants’ initial conditions, driving inputs, and environmental conditions for evaluation purposes.

For the purpose of this work, a Tesla Model 3 vehicle from the CARLA vehicle library was selected as the test vehicle. Both the subject and leading vehicle within each scenario were defined using the same vehicle model to avoid possible confounding results associated with nuances of different vehicle models. The particular vehicle used for the simulation is not important, so long as the vehicle characteristics are representative of realistic conditions, because the metrics are not sensitive to a specific vehicle type.

The subject vehicle used in the scenarios behaved according to a basic behavior defined by the “Roaming Agent” in CARLA. This agent controls the longitudinal and lateral motion of the vehicle using a PID controller (with parameters for the longitudinal control as: Kp = 0.1, Kd = 0.1 and Ki = 1.0; the parameters for lateral control are not relevant to this work). Only longitudinal control is relevant in the car-following scenarios evaluated in this work. Additionally, the agent responds to traffic lights and other vehicles within a certain proximity threshold. In our experiments, this proximity threshold was set to 0 meters in order to prevent interference with the subject vehicle’s behavior, thereby purposefully creating unsafe situations and collisions for the sake of the metrics evaluation. As a result, the subject vehicle did not apply any response when approaching the leading vehicle.
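For readers unfamiliar with the controller structure, a discrete PID speed controller using the longitudinal gains quoted above can be sketched as follows. This is a generic textbook form written for illustration; it is not CARLA's agent code, and the class name, the 0.05 s time step, and the clipping of the output to [-1, 1] are assumptions.

```python
class LongitudinalPID:
    """Generic discrete PID speed controller (illustrative; not CARLA's implementation)."""

    def __init__(self, kp=0.1, kd=0.1, ki=1.0, dt=0.05):
        self.kp, self.kd, self.ki, self.dt = kp, kd, ki, dt
        self._integral = 0.0
        self._prev_error = None  # no derivative term on the first step

    def step(self, target_speed, current_speed):
        """Return a throttle/brake command clipped to [-1, 1]."""
        error = target_speed - current_speed
        self._integral += error * self.dt
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        command = self.kp * error + self.ki * self._integral + self.kd * derivative
        return max(-1.0, min(1.0, command))


if __name__ == "__main__":
    pid = LongitudinalPID()
    print(pid.step(target_speed=15.0, current_speed=0.0))
```

Note that such a controller tracks only the target speed; with the proximity threshold set to 0 meters as described above, nothing in the control loop reacts to the lead vehicle.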

Within each scenario run, the data needed to compute the metrics were collected, and a post-processing Python pipeline was developed to calculate the instantaneous measure of the OSA metrics throughout each scenario execution.

In order to calculate the Minimum Safe Distance-Related

metrics from [3], the subject vehicle used the open source

implementation of the RSS model from [8] and [17] that is

integrated within CARLA [18]. An “RSS Sensor” was attached

to the subject vehicle which analyzed the situation at each

time step in order to calculate the longitudinal and lateral

minimum safe distances with respect to the other road users

during the simulation. Within the simulation setup, the RSS

Sensor was only used to detect dangerous situations (according

to RSS denitions), but the subject vehicle did not actuate

based on this information (i.e., did not apply a Proper Response

as dened by RSS), thus, resulting in a collision for each

scenario. By allowing collision events to occur, the capability

of the OSA metrics to pre-emptively determine potential collisions

could be assessed.

A quantitative analysis as well as a visual depiction of the

metrics was generated to better understand the relationship

between dierent metrics values and the impact that thresh-

olds or parameter values have on the context of assessing the

safety of the situation.

Experimental Results

The evaluation of the metrics’ relationships, redundancies,

areas of inapplicability, and any other potential observations

associated with the proposed metric thresholds and param-

eters is explained in the following sections.

Metric values as well as metric violation duration distributions across all scenarios were evaluated in order to understand the relevance of each metric for the assessment of the operational safety performance of AVs. In particular, metric violation durations are important for assessing the temporal occurrence of violations when identifying unsafe situations. Short metric violation durations could be indicative of a failure to identify the safety of the situation. In other words, short metric violation durations could indicate that the metric identified an unsafe situation too late, or that it is not giving a continuous safety assessment. On the other hand, long metric violation durations could be indicative of increased conservativeness. Additionally, metric violations for a given threshold/parameter set were also analyzed across the scenarios.
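The violation durations discussed above can be extracted from a per-time-step violation flag. The following is a minimal sketch, assuming the post-processing step yields one boolean per simulation step; the function name is hypothetical, and the default step of 0.05 s is the simulation time cycle reported later in the paper.

```python
from typing import List, Sequence


def violation_durations(in_violation: Sequence[bool], dt: float = 0.05) -> List[float]:
    """Return the duration in seconds of each contiguous run of True samples."""
    durations: List[float] = []
    run = 0  # number of consecutive samples currently in violation
    for flag in in_violation:
        if flag:
            run += 1
        elif run:
            durations.append(run * dt)
            run = 0
    if run:  # a violation still active at the last sample (e.g., at collision)
        durations.append(run * dt)
    return durations
```

A metric whose status flips on and off produces many short entries in this list, whereas a metric that stays in violation until the collision produces a single long entry.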

Metric Parametrization Based on Duration Distributions

For each of the evaluated metrics, specific thresholds and, in some cases, assumed parameters must be assigned in order to establish a violation occurrence. In previous work, Wishart et al. [3] compiled thresholds and assumed parameters collected from previous research as proposed values for each metric. This analysis considered a range of values for each scenario to further assess the impact of these values on generating meaningful results. Comparing the results for each threshold value demonstrated that for high threshold values on the time-based metrics, violations were generated well before the vehicle would need to react to avoid an unsafe situation, whereas for lower thresholds, an unsafe situation was not detected until the vehicle no longer had enough time to safely avoid the collision. This evaluation provided additional context to justify threshold values within the metric violation evaluations.

FIGURE 3 LVMCLS_15 scenario dynamics

FIGURE 4 LVA_20 scenario dynamics

Minimum Safe Distance Violation (MSDV)

The Minimum Safe Distance Violation metric depends on the current longitudinal and lateral distances between the subject vehicle and the lead vehicle as well as the RSS-defined longitudinal and lateral minimum safe distances calculated using the defined parameters. In the time steps during which the instantaneous longitudinal and lateral distances were less than the calculated minimum safe distances, a violation of the Minimum Safe Distance occurs (eq. (1)). In order to better understand the impact of different parameter sets on the MSDV metric, the distributions across the selected scenarios of the Minimum Safe Distance calculated using different parameter sets are depicted in Figure 5.

The results indicated that the Minimum Safe Distances calculated with the Aggressive and NDS parameter sets lead to similar distributions, with median longitudinal distances of 23.0 m and 16.2 m, respectively. In contrast, the Conservative category had a median longitudinal distance of 103.8 m, more than four times the median of the Aggressive category.

When analyzing the distribution of the duration of an MSD Violation with the different parameter sets, a more conservative set of parameters yielded longer violation durations, with a median violation duration of 16.6 s (see Figure 6). In the case of the Aggressive and NDS parameters, the median durations of an MSD Violation were 8.2 s and 5.4 s, respectively. The minimum violation durations were 2.0 s, 1.6 s, and 7.4 s for the Aggressive, NDS, and Conservative categories, respectively.

While the Aggressive and NDS parameter sets generated similar distributions of MSD values and violation durations, it is worth noting the effect of choosing one set over the other. In the case of the Aggressive category, the subject vehicle assumes that it can accelerate at up to a_max,accel = 4.1 m/s² during the response time ρ = 0.5 s and that it is then expected to brake with at least a_min,brake = 4.6 m/s². Additionally, the subject vehicle assumes that the front vehicle can brake with up to a_max,brake = 8.0 m/s². These values certainly reflect a more aggressive and jerky behavior than the values from the NDS category. The primary reason is that the values in the NDS category were extracted from human drivers’ behavior, while those in the Aggressive category came from simulation experiments. Moreover, the values in the NDS category are in line with previous research by NHTSA on the average maximum deceleration values achieved by humans [19]. In that work, NHTSA found that the mean maximum deceleration applied by humans on dry and wet surfaces was approximately 0.67 g (SD 0.25, max 1.15, min 0.004).
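To make the parameter discussion concrete, the sketch below evaluates the longitudinal minimum safe distance in the form commonly given in the RSS literature. The paper's own eq. (1) is not reproduced in this section, so the notation here is an assumption; the function name and the example speeds are hypothetical, while the parameter values are the Aggressive set quoted above.

```python
def rss_min_safe_longitudinal_distance(
    v_follow: float,     # subject (following) vehicle speed, m/s
    v_lead: float,       # lead vehicle speed, m/s
    rho: float,          # response time, s
    a_max_accel: float,  # max acceleration of the follower during rho, m/s^2
    a_min_brake: float,  # min braking applied by the follower after rho, m/s^2
    a_max_brake: float,  # max braking assumed for the lead vehicle, m/s^2
) -> float:
    """Worst-case gap the follower needs in order to stop behind the lead vehicle."""
    v_after_response = v_follow + rho * a_max_accel
    distance = (
        v_follow * rho
        + 0.5 * a_max_accel * rho ** 2
        + v_after_response ** 2 / (2.0 * a_min_brake)
        - v_lead ** 2 / (2.0 * a_max_brake)
    )
    return max(0.0, distance)  # a negative result means any gap is sufficient


# Aggressive parameter set as quoted in the text.
AGGRESSIVE = dict(rho=0.5, a_max_accel=4.1, a_min_brake=4.6, a_max_brake=8.0)
```

Under these assumptions, a follower travelling at 15 m/s behind a stopped lead vehicle would need roughly 39.6 m; an MSDV would be flagged whenever the actual gap is smaller than the value returned.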

Time-Based Metrics

In this section, the time-based metrics TTCV, MTTCV, PETV, and THWV were analyzed, in addition to the impact of differing thresholds for the initiation of a metric violation and the duration of the violations.

Figure 7 depicts the distribution of the time-based metric values across all scenarios. The TTC, MTTC, and THW values were capped at a maximum of 10 s whenever the calculations resulted in larger values, which skewed the distributions of TTC and MTTC values towards 10 s, with median values of 10 s in both cases. Conversely, THW had a median value of 1.9 s. This points to the lack of relevance of the information provided by the TTCV and MTTCV metrics overall. In the case of PET, a narrower distribution of values was observed, with a median of 1.5 s and a maximum of 3.7 s.
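The capping described above can be sketched for TTC as follows. This is a minimal illustration rather than the authors' pipeline: the function names are hypothetical, and two choices are assumptions, namely reporting the cap when the gap is not closing (lead vehicle at the same or a higher speed) and using "value <= threshold" as the violation test.

```python
def time_to_collision(gap: float, v_follow: float, v_lead: float, cap: float = 10.0) -> float:
    """TTC in seconds for a car-following pair, limited to `cap` seconds."""
    closing_speed = v_follow - v_lead
    if closing_speed <= 0.0:
        # Gap is constant or growing: TTC is undefined or negative.
        # Reporting the cap here is an assumption.
        return cap
    return min(gap / closing_speed, cap)


def is_violation(metric_value: float, threshold: float) -> bool:
    """Flag a violation when the metric drops to the threshold or below (assumed)."""
    return metric_value <= threshold
```

Because every non-closing or slowly closing time step reports the cap, a distribution built from these values piles up at 10 s, consistent with the medians noted above.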

Time-To-Collision Violation (TTCV)

The TTC metric is arguably the most popular surrogate safety metric found in the literature. Therefore, it is important to understand the meaning of a certain threshold value for TTC in relation to the operational safety of vehicles. The TTCV threshold was varied from 1 s to 5 s and the resulting TTC Violation duration is shown in Figure 8. For a threshold value of 1 s, the TTCV duration has a maximum value of 1.3 s and a median of 1 s. This implies that the metric likely fails to identify unsafe situations on time (i.e., it triggers only about 1 s before collision). For threshold values of 2 s and 3 s, the duration of a TTCV yields a median of 1.9 s and 2.9 s, respectively. The minimum violation duration for the 2 s threshold is 0.8 s whereas for a threshold of 3 s, the minimum violation duration is 0.85 s. This implies that a threshold of 3 s for TTCV might be too conservative for real-world driving for the scenarios studied in this work, since the lower duration violations that happened with a threshold of 2 s were also associated with a threshold of 3 s. Beyond a threshold of 4 s, the increasing trend of mean duration of TTCV flattens out. The maximum maintains its trend due to the fact that a higher threshold means the activation of TTC is much earlier during a scenario. After the 4 s threshold, the metric becomes too conservative to pick up on the smaller duration of violations.

FIGURE 5 Minimum Safe Distance calculations boxplot with varying parameter sets

FIGURE 6 MSDV duration boxplot with varying parameter sets

FIGURE 7 Time-based metrics calculation boxplot with varying threshold values

Modified Time-To-Collision Violation (MTTCV)

The MTTC metric was introduced as a more sophisticated TTC, considering the relative distances, speeds, and accelerations of the vehicles. However, unless linked to a threshold, MTTC by itself cannot give a sense of the severity of a situation. This is mainly due to the fact that two vehicles under test might have the same MTTC value for different combinations of relative speeds and distances. In this work, MTTC Violations were analyzed with thresholds varying from 1 s to 5 s and the resulting MTTCV durations at different thresholds are shown in Figure 9.

From Figure 9 weobserve that the minimum MTTCV

duration is always close to 0 (0.05 s in all cases, as this is the

time cycle of the simulation), even in cases of higher threshold

values. is highlights the sensitivity of the MTTC ca lculation

to changes in the relative speed, acceleration and distances

between the vehicles. In all cases, very short violations were

found, suggesting that MTTCV could result in multiple

conservative violations (e.g., unwarranted warnings of an

unsafe situation) regardless of the threshold value.

Furthermore, the median value of a duration for MTTCV for

a thresholds of 3, 4 and 5 seconds is biased towards the lower

values, implying shorter violation durations even for the

highest threshold. The variability demonstrated by the

MTTCV metric across dierent threshold values illustrated

some of the drawbacks of this measurement in the context of

reliably evaluating safety of a vehicle consistently

across scenarios.
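The sensitivity of MTTC to acceleration can be seen from a constant-acceleration sketch. The formulation below is an assumption, since the paper's exact MTTC equation is not reproduced in this section: it takes MTTC as the smallest positive time t satisfying gap = Δv·t + ½·Δa·t², with Δv and Δa defined as follower minus lead, and reports a cap (assumed 10 s, mirroring the cap applied to the time-based metrics) when no collision is predicted. All names are hypothetical.

```python
import math


def modified_ttc(
    gap: float,
    v_follow: float,
    v_lead: float,
    a_follow: float,
    a_lead: float,
    cap: float = 10.0,
) -> float:
    """Smallest positive time at which the gap closes under constant accelerations."""
    dv = v_follow - v_lead  # closing speed, m/s
    da = a_follow - a_lead  # closing acceleration, m/s^2
    if abs(da) < 1e-9:
        # No relative acceleration: the expression reduces to plain TTC.
        return min(gap / dv, cap) if dv > 0.0 else cap
    discriminant = dv * dv + 2.0 * da * gap
    if discriminant < 0.0:
        return cap  # the gap never closes
    root = math.sqrt(discriminant)
    candidates = [(-dv + root) / da, (-dv - root) / da]
    positive = [t for t in candidates if t > 0.0]
    return min(min(positive), cap) if positive else cap
```

Because Δa enters the expression directly, a brief braking or throttle change in either vehicle can move the result across a threshold for a single time step, which is one plausible reading of the very short violation durations reported above.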

Post-Encroachment Time Violation (PETV)

PET is a traditional surrogate safety measure used frequently in traffic engineering studies. By including PET in the proposed set of metrics, the intent was to provide comparable measures to existing datasets in addition to a metric that is relatable to both human-driven and automated vehicles. In the context of the tested scenarios, PET was measured as the difference in time between the lead and subject vehicles passing through a conflict point. In the car-following scenarios analyzed, several conflict points were defined throughout the scenarios, resulting in a PET curve over time, rather than a single PET value for each scenario. The latter case was the original intent of the PET [9]. In the cases where the subject vehicle’s path does not coincide with the path of the lead vehicle, i.e., the rear bumper of the lead vehicle and the front bumper of the subject vehicle do not both pass through the conflict point, the PET value was non-existent. The threshold for a PET violation was varied from 0.5 s to 2 s. As depicted in Figure 10, the PET violation duration was minimal for all thresholds due to scenarios in the LVS category where there is no post encroachment. In all cases, the maximum violation duration was larger than 15 s due to scenarios LVD_14 and LVD_15 where vehicles were 5 m away from each other initially. For thresholds of 0.5 s, 1 s, 1.2 s, and 1.5 s, there is an increasing trend for the median values of the violation duration, with values of 1 s, 2 s, 2.4 s, and 3.6 s, respectively. The variability demonstrated by the PET violation metric illustrated some of the pitfalls associated with this measurement in the context of reliably and consistently evaluating the safety of a vehicle across scenarios.
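The PET curve described above can be sketched as a per-conflict-point time difference. This is an illustrative assumption about the bookkeeping, not the authors' implementation: the function name is hypothetical, the inputs are the times at which the lead vehicle's rear bumper leaves and the subject vehicle's front bumper reaches each conflict point, and a missing arrival is represented by None.

```python
from typing import List, Optional, Sequence


def pet_curve(
    t_lead_exit: Sequence[float],
    t_subject_arrival: Sequence[Optional[float]],
) -> List[Optional[float]]:
    """PET at each conflict point; None where the subject never reaches the point."""
    return [
        None if t_subject is None else t_subject - t_lead
        for t_lead, t_subject in zip(t_lead_exit, t_subject_arrival)
    ]
```

Conflict points with no PET value correspond to the cases described above in which the metric gives no information, for example when the lead vehicle is stopped and never clears the point.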

Time Headway Violation (THWV)

The THW metric is similar to the TTC metric; however, it is defined by the relative distance between the subject vehicle and lead vehicle and only the velocity of the subject vehicle. THW violations were evaluated for varying thresholds from 1 s to 5 s across all scenarios and the THWV duration distributions are depicted in Figure 11.

FIGURE 8 TTCV duration boxplot with varying

threshold values

FIGURE 9 MTTCV duration boxplot with varying

threshold values

FIGURE 10 PETV metric boxplot with varying

threshold values


The duration of the THWV for a threshold value of 1 s has a median of 3 s and a maximum of 18.3 s, due to the conditions in scenarios LVD_14 and LVD_15 (same as with PETV). When the threshold is increased, the violation duration distributions are skewed towards the maximum value due to the fact that the THW calculation only considers the relative distance between both vehicles regardless of the speed of the lead vehicle. Therefore, a THWV will happen when the ratio between the distance between the vehicles and the subject vehicle’s speed falls to the threshold value (Δd/speed = threshold). Since most scenarios in this work start with Δd = 30 m and the vehicles accelerate until reaching their target speed (often greater than 15 m/s), a THWV will be triggered before the subject vehicle reaches its target speed, independent of the behavior of the lead vehicle. In these situations, a THWV triggers when the subject vehicle is travelling at 15 m/s, 10 m/s, 7.5 m/s, and 6 m/s for thresholds of 2 s, 3 s, 4 s, and 5 s, respectively. This results in larger violation durations, as seen in Figure 11.
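The trigger speeds quoted above follow directly from Δd/speed = threshold. The short sketch below reproduces that arithmetic for the 30 m initial gap; the function names are hypothetical, and returning infinity for a stationary subject vehicle is an assumption.

```python
def time_headway(gap: float, v_subject: float) -> float:
    """THW in seconds; infinite while the subject vehicle is stationary (assumed)."""
    return float("inf") if v_subject <= 0.0 else gap / v_subject


def thw_trigger_speed(gap: float, threshold: float) -> float:
    """Subject speed at which THW equals the threshold for a given gap."""
    return gap / threshold


# Trigger speeds for a 30 m gap at thresholds of 2 s, 3 s, 4 s, and 5 s.
trigger_speeds = [thw_trigger_speed(30.0, t) for t in (2.0, 3.0, 4.0, 5.0)]
```

Note that neither function takes the lead vehicle's speed as an input, which is why the violation appears at the same point regardless of what the lead vehicle is doing.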

Metrics Parameterization Based on Temporal Occurrence

As mentioned in previous sections, one of the criteria for evaluating the metrics is to compare the results against the DSV, considered to be the ground truth metric. This was done by examining the temporal occurrence of each metric for each scenario with respect to the defined ground truth. From this, three temporal regions were identified:

1. Metric violation prior to DSV with a_brake = 5 m/s²: This indicates that the subject vehicle may not require immediate action since a forced braking event would not require excessive deceleration. However, if a metric violation occurred too early compared to DSV at 5 m/s², this may be an overly conservative violation.

2. Metric violation between DSV with a_brake = 5 m/s² and DSV with a_brake = 8.3 m/s²: This indicates the subject vehicle is in a situation in which an avoidance maneuver may need to be taken, since a forced braking event might require a deceleration greater than 5 m/s², considered to be an emergency maneuver [10] that may not be a comfortable deceleration rate for passengers [19]. Metric violations happening in this temporal region may not indicate failure to identify the safety of the situation, as the DSV does not consider the reaction time of the subject vehicle but also does not consider the speed of the lead vehicle.

3. Metric violation after DSV with a_brake = 8.3 m/s²: This indicates that the subject vehicle is in a situation in which an avoidance maneuver is recommended, since a forced braking event might require a deceleration rate greater than 8.3 m/s², which was established in [10] as an appropriate expected value for AEB systems; thus, the deceleration rate would exceed (or nearly exceed) the braking capability of the AV, and a collision may not be avoided. Metric violations happening in this temporal region may indicate failure to identify the safety of the situation at an appropriate time.
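The three regions above amount to classifying the start time of a metric violation against the two DSV times. The following is a minimal sketch with a hypothetical function name; how a violation that starts exactly at one of the DSV times is assigned is an assumption, and the timings in the usage note are purely illustrative.

```python
def temporal_region(t_violation: float, t_dsv_5: float, t_dsv_8_3: float) -> int:
    """Return 1, 2, or 3 according to when a metric violation starts.

    t_dsv_5   -- time of the DSV computed with a_brake = 5 m/s^2
    t_dsv_8_3 -- time of the DSV computed with a_brake = 8.3 m/s^2
    """
    if t_violation < t_dsv_5:
        return 1  # before emergency-level braking would be required
    if t_violation <= t_dsv_8_3:
        return 2  # emergency braking required, still within assumed AEB capability
    return 3      # required deceleration exceeds the assumed AEB capability
```

For instance, with DSV times of 23.0 s and 23.4 s, a violation starting at 22.0 s would fall in region 1, one starting at 23.2 s in region 2, and one starting at 24.0 s in region 3.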

The results of metric violation temporal occurrences across all scenarios at varying thresholds/parameters based on the three regions are listed in Table 8. Additionally, the average time difference between a metric violation and the DSV at 5 m/s², the DSV at 8.3 m/s², or a collision was calculated for each threshold, depending on whether the metric violation occurred in temporal region 1, 2, or 3, respectively. This allows for analyzing the trade-offs between different thresholds and parameter sets. The optimal threshold/parameter set should maximize violation occurrences in region 1 at the minimum average time prior to DSV at 5 m/s², maximize the time difference between violation occurrence and DSV at 8.3 m/s² in region 2, and minimize violation occurrences in region 3. Otherwise, the metric at the given threshold could become overly conservative or not relevant enough.

The TTCV metric with a threshold of 2 s has as many violations in region 1 as TTCV with thresholds of 3 s and 4 s, but with the lowest average time prior to DSV at 5 m/s². The MTTCV metric with a threshold of 2 s, while it achieved a lower number of violations in region 1 than did higher threshold values, has the lowest average time prior to DSV at 5 m/s². The PETV metric with a threshold of 1.5 s achieved a slightly higher average time prior to DSV at 5 m/s² (only 0.13 s higher than a threshold of 1 s) but with a lower number of metric violations within region 3. The THWV with a threshold of 2 s achieves all occurrences in region 1, but the average time of a THWV prior to DSV at 5 m/s² of 2.4 s indicates that the metric with the given threshold is possibly overly conservative. Finally, the MSDV metric occurs most frequently in the first region with varying parameters, but the NDS parameter category achieves the lowest average time of a metric violation prior to DSV at 5 m/s².

Metrics Parametrization Selection

Based on the metrics parametrization analysis from the previous section, the results shown in Table 8 in Appendix C, and the threshold/parameter set selection process discussed above, a threshold value or a parameter set was chosen for each metric and the OSA metrics in the scenarios defined in this study were evaluated. A value of 2 s was chosen for the TTCV, MTTCV, and THWV metrics and 1.5 s for PETV, as these values demonstrated a balance between conservativeness and usefulness when compared to other threshold values for each given metric. For the MSDV metric, the parameter set from the NDS category was chosen as it demonstrated reasonable following distances (Figure 5) at the minimum violation duration (median of 5.1 s) with a reasonable compromise between conservativeness and usefulness compared to the other parameter categories. It is worth noting that parameter/threshold values may differ when evaluating the safety performance of human-driven vehicles, compared to that of AVs, as human drivers could have longer reaction times than the ones proposed in this work.

FIGURE 11 THW metric boxplot with varying threshold values

Scenario Results

The times at which metric violations occurred throughout the execution of the scenarios with regard to the collision point are shown in Figure 12 through Figure 15. The violation symbols on the figures depict the start of a metric violation and the trace after it demarcates the violation duration. Thus, in scenarios where more than one violation occurred for a given metric¹, the cause was either a change in the dynamics of the scenario or an initial, overly conservative violation. Additionally, the temporal occurrence of the ground truth DSV is shown in each figure for both the minimum deceleration needed to perform an emergency maneuver (a_brake = 5 m/s², in orange) and a reasonable (maximum) deceleration applied by automatic emergency braking (AEB) systems (a_brake = 8.3 m/s², in red) for a visual comparison of the metrics’ performance in preemptive warning versus emergency warning time needed to respond to a conflict. For example, in the LVS_10 scenario of Figure 13, the DSV at 5 m/s² occurred at 23.0 s and the DSV at 8.3 m/s² occurred at 23.4 s.

As depicted in each of the figures, a collision resulted from each scenario and the timing was dictated by scenario dynamics, including the speeds and initial separation of the vehicles. For example, in LVS_10 shown in Figure 13, a collision happened after 24.0 s of scenario execution (demonstrated by the Headway Distance curve reaching 0 m on the secondary y-axis at right) due to the lower speed of the subject vehicle compared to scenarios LVS_15 and LVS_18, in which vehicles collided earlier, at 18.1 s and 16.8 s, respectively. In the case of the metric violations, in all scenarios within the LVS category (Figure 13), the THWV, TTCV, and MTTCV occurred at the same time (2 s before the collision) and lasted until the collision occurred. PETV occurred at the time of collision, which is not useful as a warning. The only metric whose results varied with the different scenario configurations was the MSDV metric.

For TTCV, it is worth noting that for scenarios in which the lead vehicle is driving faster than the subject vehicle (e.g., LVD_16), the TTCV metric fails to provide any information with respect to the safety of the situation during the condition v_l > v_f since it results in a negative denominator for the TTC calculation. A similar behavior happens in cases in which vehicles are driving at the same speed (e.g., LVD_18): a TTCV will not occur unless one of the vehicles changes speed, and even in that case, the TTCV might occur too late. Moreover, when vehicles have slightly different speeds, as in LVD_14 and LVD_15, the vehicles drive 5 m away from each other without triggering a TTCV until the lead vehicle starts decelerating. The results on TTCV show that this metric is not always useful, leading to numerous delayed violations in some cases. Additionally, TTCV is not robust to variations in the relative vehicle speeds.

¹ This occurred for MTTCV in scenarios LVD_14 and LVD_15 and for THWV in scenario LVD_16.

MTTCV showed similar behavior to that of TTCV, often triggering a violation less than 1 s before TTCV. In some cases, due to variations in the velocity and accelerations of both vehicles throughout the scenarios, the metric experienced changing status of violations with short durations, indicating that MTTCV may not be a consistently reliable metric for different scenarios (e.g., LVD_14). This observation would suggest that in scenarios where vehicles may experience minimal changes in velocity and accelerations (i.e., typical highway car-following scenarios) other metrics may be more suitable.

The PETV metric did not provide relevant safety information in cases where there was no post encroachment (e.g., LVS_10, LVS_15, and LVS_18), meaning that the subject vehicle did not travel through the conflict point. These can be considered as analogous to “false negatives” in perception systems. This is a major pitfall for the robustness and usefulness of the PETV metric in the car-following scenarios evaluated in this work.

FIGURE 12 Lead Vehicle Decelerating metrics violations

THWV is a metric that was not considered in the initial proposed set of safety metrics; however, it has been evaluated here due to its relevance in a recently published standard for ALKS features [10]. This metric is dependent on the distance between the subject and lead vehicles and the velocity of the subject vehicle, which may be confounding depending on the particular driving situation. Consider, for example, scenarios LVD_16, LVD_18, and LVD_20. A THWV always happens at the same time even when the lead vehicle is behaving differently (higher, same, and lower target speed than the subject vehicle, respectively). This metric produces a violation once the subject vehicle is moving at a speed of 15 m/s or more. Specifically, in the case of LVD_16, the first THWV disappears once the distance between vehicles increases as the lead vehicle moves faster, and a second THWV appears approximately 2 s prior to collision once the lead vehicle starts decelerating. This indicates a lack of robustness to changing scenario configurations. One benefit of the THWV metric over other metrics, such as TTCV and PETV, is the relevance and avoidance of discontinuities and negative values in the data.

The MSDV metric seemed robust to changes in scenario dynamics, providing a continuous safety assessment even when the lead vehicle was moving faster than the subject vehicle (e.g., LVD_16). Variations in its subjective parameters yielded differing results, but overall, it is a continuous metric that considers many relevant aspects of driving such as the reaction time and braking capability of the vehicles. As a result, overly conservative violations were not seen in the scenarios for the parameter values chosen.

The criteria introduced earlier defining the efficacy of a metric according to robustness, relevance, and comparison to the ground truth DSV metric were evaluated for each metric based on the experimental results. Table 6 summarizes the efficacy of each metric in the evaluation criteria categories with “-” for low efficacy, “+” for medium efficacy, and “++” for high efficacy.

FIGURE 14 Lead Vehicle Moving at Lower Constant Speed

metrics violations

FIGURE 15 Lead Vehicle Accelerating metrics violations

FIGURE 13 Lead Vehicle Stopped scenarios and

metrics violations


Conclusions and Future Work

This work demonstrates the capability of calculating the previously proposed OSA metrics through the use of simulation and evaluates the sensitivity and interplay of these metrics for variations in initial conditions and different scenarios. The metric calculations were directly computed from the CARLA outputs to produce graphical data that could be analyzed for further evaluation.

The following highlights the conclusions drawn from evaluation of the scenarios presented in the context of this paper and the OSA metrics refinement proposed by this work:

• The TTCV metric is unreliable in cases where there is no relative velocity difference between the lead and following vehicles. If the lead and subject vehicles are in close proximity with the same velocity, the TTCV does not provide relevant safety information. The TTC metric has been utilized in many human driver studies, providing a common reference metric, but the experiments in this work demonstrated several pitfalls of the metric. As such, the TTCV metric may only be relevant within the context of an OSA methodology for comparison purposes to previous traffic engineering research and naturalistic studies.

• The MTTCV metric is sensitive to changes in the accelerations of the vehicles involved in the simulated scenarios. MTTC violations occurred frequently with short durations at high threshold values, indicating high sensitivity to any variation in the vehicle position, velocity, and accelerations. As such, the MTTCV metric does not provide reliable and robust notification of a collision and may not be an adequate metric for the scenarios evaluated in this paper. As a result, the MTTCV metric may not be relevant for a finalized OSA metrics set.

• The PETV metric may be more applicable to intersection scenarios; yet, overall it is an ex post facto metric that in many cases will not provide a continuous, real-time measurement of a given situation and is more useful in reactive assessments. Therefore, PETV may not be an adequate metric for the scenarios evaluated in this paper and may not be relevant in the finalized OSA metrics set.

• The THWV metric utilizes the relative distance of the involved vehicles and the speed of the following vehicle; however, it does not consider any information regarding the dynamics of the lead vehicle. As a result, the THWV metric can lead to confounding results in situations where the parameters of the lead vehicle play a major role in the context of the scenario (e.g., lead vehicle stopped). Therefore, THWV may not be an adequate metric for the scenarios evaluated in this paper and may not be relevant in the finalized OSA metrics set.

• The MSDV metric is subject to the parameters used in its formulation; however, research has been conducted in order to propose sets of parameters that follow naturalistic driving behavior from humans [15]. The use of naturalistic studies and generalized vehicle dynamic capabilities provides a more comprehensive metric which incorporates the complexity of the physical attributes of the vehicles. The MSDV formulation accounts for “what if” worst-case situations, making it more robust to changes in the dynamics of the situation without being an overly conservative measure, while the time-based metrics are based solely on relative measurements of vehicle motion including relative position, velocity, and acceleration.

With respect to future work, there is a need for optimiza-

tion of the selection process of the thresholds/parameter sets

of the OSA metrics. A deeper understanding of violation

duration and temporal occurrence will allow for the selection

process to determine the optimal threshold or parameter set.

This is important for comparing the results for optimized

metrics, which will lead to refinement and finalization of the

OSA metrics set.

Additional scenarios will also be evaluated in future work. This paper focused on car-following scenarios, but the use of the script and database to calculate the metrics for any given scenario will allow for others to be considered. Although the current work examines scenarios composing approximately 18.4% of all light-duty vehicle collisions in the U.S., further simulation work should be conducted to consider additional scenarios, such as intersections or lane changing scenarios. The analysis of the additional scenarios will provide further insight into the relative merits of the OSA metrics, with the intent to further validate the set and remove any metrics that are not necessary to the OSA methodology under development.

To facilitate the OSA metrics set finalization, there is a need to automate the process of calculating all applicable metrics proposed within [3] with inputs of the required parameters. Although the CARLA script was modified to calculate and output the safety metric results independently, a script and database combination is planned to establish a repository and accompanying methodology capable of calculating the metrics from any simulation software that is capable of outputting the necessary variables rather than limiting the scope specifically to CARLA.

One such soware that is planned to beused for future

work is Human, Vehicle, Environment (HVE) which is a

physics-based accident reconstruction soware traditionally

used in the evaluation of vehicle dynamics in collision

scenarios. Although CARLA provided a useful platform to

evaluate the discussed metrics, collision incidents were not

evaluated. CARLA is capable of providing the timing for colli-

sion incidents within the scenarios; however, one limitation

of CARLA is the inability of the soware to handle the colli-

sion and post-collision vehicle dynamics associated with an

TAB L E 6 Evaluation of OSA metrics based on

presented criteria

Criteria TTCV MTTCV PETV THWV MSDV

Robustness + + + + ++

Relevance - + - ++ ++

DSV Comparison - - - ++ +

Downloaded from SAE International by Jeffrey Wishart, Monday, April 05, 2021

EVALUATION OF OPERATIONAL SAFETY ASSESSMENT (OSA) METRICS FOR AUTOMATED VEHICLES IN SIMULATION 14

impact. e simulation events were terminated at the initia-

tion of a collision event since the vehicles are not modeled to

a level of delity that is capable of accurately calculating the

physics of the vehicles including rotation, crush, and exit

velocities associated with the momentum and energy charac-

teristics of the collision. e proposed Collision Incident (CI)

metric in [3] utilizes the KABCO index to determine the

severity of the collision. e use of HVE could enhance the

severity quantication for any given scenario by providing

information such as delta-v, dissipated energy, and principle

direction of force (PDOF). is renement would provide

additional granularity to not only consider whether a collision

occurs but also understand the severity and dynamics of a

collision if one does occur.
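To make the severity quantities concrete, the sketch below computes delta-v and dissipated energy for the simplest possible case: a one-dimensional central impact with a coefficient of restitution. This is a textbook momentum and energy balance, not the HVE collision model; the function name, the masses, and the speeds are assumptions chosen for illustration, and PDOF is not computed because in one dimension it lies along the travel axis.

```python
from typing import Tuple


def collinear_impact(
    m1: float, v1: float, m2: float, v2: float, restitution: float = 0.0
) -> Tuple[float, float, float]:
    """Return (delta_v1, delta_v2, dissipated_energy) for a 1D central impact.

    m1, m2: vehicle masses [kg]; v1, v2: pre-impact velocities [m/s] on a shared axis.
    restitution: coefficient of restitution in [0, 1] (0 = perfectly plastic).
    """
    if m1 <= 0.0 or m2 <= 0.0:
        raise ValueError("masses must be positive")
    if not 0.0 <= restitution <= 1.0:
        raise ValueError("restitution must lie in [0, 1]")

    total_mass = m1 + m2
    closing_velocity = v1 - v2

    # Conservation of momentum plus the restitution relation gives each delta-v.
    delta_v1 = -(1.0 + restitution) * m2 / total_mass * closing_velocity
    delta_v2 = (1.0 + restitution) * m1 / total_mass * closing_velocity

    # Kinetic energy lost in the impact, expressed with the reduced mass.
    reduced_mass = m1 * m2 / total_mass
    dissipated = 0.5 * reduced_mass * (1.0 - restitution**2) * closing_velocity**2

    return delta_v1, delta_v2, dissipated
```

For two 1500 kg vehicles in a rear-end impact at 20 m/s and 10 m/s with zero restitution, the call `collinear_impact(1500.0, 20.0, 1500.0, 10.0)` yields a delta-v of 5 m/s in magnitude for each vehicle. A reconstruction tool would add rotation, crush, and off-axis forces that this sketch deliberately omits.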

When the OSA metrics have been finalized and validated, the OSA methodology will be developed as part of an overall SCF. This will allow a score to be assigned to the navigation of any given scenario by an AV (or human-driven vehicle), and that score will be a part of the AV safety case.

References

1. Safetyengineering.wordpress.com, “The Safety Engineering Resource,” April 18, 2008, accessed Jan. 8, 2021.

2. Underwriters Laboratories (UL), ANSI/UL 4600 - Standard for Safety for the Evaluation of Autonomous Products, 2020.

3. Wishart, J., Como, S., Elli, M., Russo, B. et al., “Driving Safety Performance Assessment Metrics for ADS-Equipped Vehicles,” SAE Technical Paper 2020-01-1206, 2020. https://doi.org/10.4271/2020-01-1206.

4. Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., and Koltun, V., “CARLA: An Open Urban Driving Simulator,” in 1st Annual Conference on Robot Learning, 2017.

5. “ScenarioRunner,” https://github.com/carla-simulator/scenario_runner, accessed Nov. 1, 2020.

6. Wassim, N., Ranganathan, R., Srinivasan, G., Smith, J., Toma, S., Swanson, E., and Burgett, A., “Description of Light-Vehicle to-Vehicle Communications for Safety Applications Based on Vehicle-to-Vehicle Communications,” National Highway Traffic Safety Administration, Report No. DOT HS 811 731, 2013.

7. Najm, W.G., Smith, J.D., and Yanagisawa, M., “Pre-Crash Scenario Typology for Crash Avoidance Research,” National Highway Traffic Safety Administration, Report No. DOT-VNTSC-NHTSA-06-02, 2007.

8. Shalev-Shwartz, S., Shammah, S., and Shashua, A., “On a Formal Model of Safe and Scalable Self-Driving Cars,” arXiv:1708.06374, 2017.

9. Gettman, D. and Head, L., “Surrogate Safety Measures from Traffic Simulation Models,” Transportation Research Record 1840(1):104-115, 2003.

10. United Nations Economic Commission for Europe (UNECE), “Proposal for a New UN Regulation on Uniform Provisions Concerning the Approval of Vehicles with Regards to Automated Lane Keeping System,” 2020, https://undocs.org/ECE/TRANS/WP.29/2020/81.

11. Silberling, J., Wells, P., Acharya, A., Kelly, J., and Lenkeit, J., “Development and Application of a Collision Avoidance Capability Metric,” SAE Technical Paper 2020-01-1207, 2020. https://doi.org/10.4271/2020-01-1207.

12. Weng, B., Rao, S., Deosthale, E., Schnelle, S., and Barickman, F., “Model Predictive Instantaneous Safety Metric for Evaluation of Automated Driving Systems,” in IEEE Intelligent Vehicles Symposium (IV), 2020.

13. Javed, M.A. and Khan, J.Y., “Performance Analysis of a Time Headway Based Rate Control Algorithm for VANET Safety Applications,” in 7th International Conference on Signal Processing and Communication Systems (ICSPCS), Carrara, VIC, 2013.

14. Rodionova, A., Alvarez, I., Elli, M.S., Oboril, F., Quast, J., and Mangharam, R., “How Safe Is Safe Enough? Automatic Safety Constraints Boundary Estimation for Decision-Making in Automated Vehicles,” in IEEE Intelligent Vehicles Symposium (IV), 2020.

15. China Intelligent Transportation Systems (ITS) Alliance, “Safety Assurance Technical Requirements for Decision-Making on Autonomous Vehicles,” C-ITS Alliance Report, 2020.

16. ASAM, “OpenScenario,” https://www.asam.net/standards/detail/openscenario/, accessed Jan. 11, 2020.

17. Gassmann, B., Oboril, F., Buerkle, C., Liu, S. et al., “Towards Standardization of AV Safety: C++ Library for Responsibility Sensitive Safety,” in IEEE Intelligent Vehicles Symposium (IV), 2019.

18. Gassmann, B., Pasch, F., Oboril, F., and Scholl, K.-U., “Integration of Formal Safety Models on System Level Using the Example of Responsibility Sensitive Safety and CARLA Driving Simulator,” in International Conference on Computer Safety, Reliability, and Security, 2020.

19. Mazzae, E.N., Barickman, F.S., Forkenbrock, G., and Baldwin, G.H., “NHTSA Light Vehicle Antilock Brake System Research Program Task 5.2/5.3: Test Track Examination of Drivers’ Collision Avoidance Behavior Using Conventional and Antilock Brakes,” DOT HS809, 2003.

20. Balas, V.E. and Balas, M.M., Driver Assisting by Inverse Time to Collision (World Automation Congress (WAC): Budapest, 2006).

Acknowledgement

This work was made possible by the generous contributions and funding provided by the Institute of Automated Mobility (IAM). The authors would like to thank the IAM for its continued support in advancing research surrounding vehicle automation.

Definitions/Abbreviations

ABC - Achieved Behavioral Competency

AD - Aggressive Driving

ADS - Automated Driving System

ADSA - ADS Active


AEB - Automatic Emergency Braking

AHJ - Authority Having Jurisdiction

ALKS - Automated Lane Keeping System

AV - ADS-Equipped Vehicle

CAC - Collision Avoidance Capability

CARLA - Car Learning to Act

CI - Collision Incident

HTCDER - Human Trac Control Detection Error Rate

HTCVR - Human Trac Control Violation Rate

IAM - Institute of Automated Mobility

MPrISM - Model Predictive Instantenous Safety Metric

MSD - Minimum Safe Distance

MSDCE - Minimum Safe Distance Calculation Error

MSDF - Minimum Safe Distance Factor

MSDV - Minimum Safe Distance Violation

MTTC - Modied Time-to-Collision

MTTCV - Modied Time-to-Collision Violation

NHTSA - National Highway Trac Safety Administration

ORAD - On-Road Automated Driving

OSA - Operational Safety Assessment

PET - Post Encroachment Time

PETV - Post Encroachment Time Violation

PRA - Proper Response Action

RSS - Responsibility-Sensitive Safety

SCF - Safety Case Framework

THW - Time Headway

THWV - Time Headway Violation

TLW - Trac Law Violation

TTC - Time-to-Collision

TTCV - Time-to-Collision Violation

V&V - Verication and Validation


Appendix A Minimum Longitudinal and Lateral

Distances of MSDV (from [])

long

accel

long

min

,max,

1111

111

ρρ

ccel

long

decel

long

decel

long

()

−

()







,min,

,max,











+

(7)

Where the subject vehicle (subscript 1) is following behind another entity (subscript 2) and both are moving in the same

direction, and [x]+≔max{x, 0}.
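A numerical evaluation of Equation (7) can help when comparing parameter sets. The sketch below assumes the standard RSS form of the minimum longitudinal distance from [8]: the distance the subject vehicle covers during its response time while accelerating, plus its braking distance at minimum deceleration, minus the braking distance of the lead entity at maximum deceleration, floored at zero. The function name, argument names, and the parameter values in the usage note are illustrative assumptions, not the parameter sets used in this paper.

```python
def min_safe_longitudinal_distance(
    v1: float,
    v2: float,
    rho1: float,
    a1_max_accel: float,
    a1_min_decel: float,
    a2_max_decel: float,
) -> float:
    """Minimum safe longitudinal distance [m] in the assumed RSS form of Equation (7).

    v1, v2: longitudinal speeds of the subject (rear) and lead (front) vehicles [m/s].
    rho1: response time of the subject vehicle [s].
    a1_max_accel: maximum acceleration of the subject during the response time [m/s^2].
    a1_min_decel: minimum braking deceleration of the subject, positive [m/s^2].
    a2_max_decel: maximum braking deceleration of the lead, positive [m/s^2].
    """
    if a1_min_decel <= 0.0 or a2_max_decel <= 0.0:
        raise ValueError("decelerations must be positive magnitudes")

    # Subject speed at the end of the response time, assuming worst-case acceleration.
    v1_after_response = v1 + rho1 * a1_max_accel

    distance = (
        v1 * rho1
        + 0.5 * a1_max_accel * rho1**2
        + v1_after_response**2 / (2.0 * a1_min_decel)
        - v2**2 / (2.0 * a2_max_decel)
    )
    # The [x]+ operator: a negative result means no separation is required.
    return max(distance, 0.0)
```

With both vehicles at 20 m/s, a 1 s response time, 2 m/s^2 maximum acceleration, 4 m/s^2 minimum deceleration, and 8 m/s^2 maximum lead deceleration, the function returns 56.5 m. An MSDV in the longitudinal direction would then correspond to a measured gap smaller than this value, assuming the lateral condition is also violated as the MSDV definition requires.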

$$
d_{\min}^{lat} = \mu + \left[ \frac{2 v_1^{lat} + \rho_1 a_{1,\max,accel}^{lat}}{2}\rho_1 + \frac{\left(v_1^{lat} + \rho_1 a_{1,\max,accel}^{lat}\right)^2}{2 a_{1,\min,decel}^{lat}} - \left( \frac{2 v_2^{lat} - \rho_2 a_{2,\max,accel}^{lat}}{2}\rho_2 - \frac{\left(v_2^{lat} - \rho_2 a_{2,\max,accel}^{lat}\right)^2}{2 a_{2,\min,decel}^{lat}} \right) \right]_+ \tag{8}
$$

where the subject vehicle (subscript 1) is to the left of the other entity (subscript 2), $d_{\min}^{lat}$ is the distance between the right side of the ego vehicle and the left side of the other entity, $\mu$ is a lateral fluctuation margin [m], and $[x]_+ := \max\{x, 0\}$.

Appendix B: Pre-Crash Scenario Category Summary

Appendix C Temporal Occurrence Results

TAB L E 7 Pre-crash scenario categories and descriptions for two-vehicle crashes from [6]

Scenario Category Description

Scenario

Category ID

Proportion of

Collisions1

Lead Vehicle Stopped Subject vehicle is traveling straight in an urban area, in daylight, under

clear weather conditions, at an intersection-related location with a

posted speed limit of 35 mph and approaches a stopped lead vehicle.

LVS 10.2%

Lead Vehicle Decelerating Subject vehicle is traveling straight and following a lead vehicle in a rural

area, in daylight, under clear weather conditions, at a non-junction with a

posted speed limit of 55 mph or more, and the lead vehicle suddenly

decelerates.

LVD 4.2%

Lead Vehicle Moving at Lower

Constant Speed

Subject vehicle is traveling straight in an urban area, in daylight, under

clear weather conditions, at a non-junction with a posted speed limit of

55 mph or more; and approaches a lead vehicle moving at lower

constant speed.

LVMLCS 3.7%

Lead Vehicle Accelerating Subject vehicle is traveling straight in an urban area, in daylight, under

clear weather conditions, at an intersection-related location with a

posted speed limit of 45 mph and approaches an accelerating lead

vehicle.

LVA 0.3%

Total: 18.4%

* The relative frequency is based on the collision statistics from Table 7 in [6].

Downloaded from SAE International by Jeffrey Wishart, Monday, April 05, 2021

electronic, me chanic al, photo copying, recording, or other wise, without the prior writ ten permission of SAE International.

Positions and opinions adva nced in this work are those of the author(s) and not necessarily those of SAE International. Responsibility for the content of the work lies

solely with the author(s).

ISSN 0148-7191

EVALUATION OF OPERATIONAL SAFETY ASSESSMENT (OSA) METRICS FOR AUTOMATED VEHICLES IN SIMULATION

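TABLE 8 Temporal occurrence results of metrics violations at varying thresholds across all scenarios*

| Metric | Threshold | # Violations Prior to DSV at 5 m/s² | Avg. Time Difference of Violation and DSV at 5 m/s² [s] | # Violations Between DSV at 5 and 8.3 m/s² | Avg. Time Difference of Violation and DSV at 8.3 m/s² [s] | # Violations After DSV at 8.3 m/s² | Avg. Time Difference of Violation and Collision [s] |
|---|---|---|---|---|---|---|---|
| MSDV | Aggressive | 13 | 1.27 | 0 | - | 0 | - |
| MSDV | NDS | 9 | 0.58 | 4 | 2.39 | 0 | - |
| MSDV | Conservative | 13 | 7.65 | 0 | - | 0 | - |
| TTCV | 1 s | 0 | - | 3 | 0.20 | 10 | 0.91 |
| TTCV | 2 s | 4 | 0.48 | 1 | 0.20 | 8 | 1.73 |
| TTCV | 3 s | 4 | 1.36 | 3 | 0.43 | 6 | 2.20 |
| TTCV | 4 s | 4 | 2.21 | 3 | 1.28 | 6 | 2.67 |
| TTCV | 5 s | 8 | 2.03 | 2 | 0.93 | 5 | 2.76 |
| MTTCV | 1 s | 2 | 1.75 | 4 | 0.26 | 10 | 0.94 |
| MTTCV | 2 s | 6 | 0.97 | 1 | 0.85 | 8 | 1.84 |
| MTTCV | 3 s | 14 | 6.16 | 7 | 0.58 | 7 | 2.54 |
| MTTCV | 4 s | 23 | 4.72 | 3 | 1.17 | 7 | 4.61 |
| MTTCV | 5 s | 22 | 5.57 | 9 | 1.02 | 5 | 6.14 |
| THWV | 1 s | 2 | 0.75 | 6 | 0.51 | 5 | 2.68 |
| THWV | 2 s | 14 | 2.43 | 0 | - | 0 | - |
| THWV | 3 s | 13 | 4.63 | 0 | - | 0 | - |
| THWV | 4 s | 13 | 5.51 | 0 | - | 0 | - |
| THWV | 5 s | 13 | 6.09 | 0 | - | 0 | - |
| PETV | 0.5 s | 0 | - | 0 | - | 13 | 3.25 |
| PETV | 1 s | 2 | 0.45 | 1 | 0.95 | 10 | 1.33 |
| PETV | 1.2 s | 2 | 0.55 | 1 | 1.95 | 10 | 1.70 |
| PETV | 1.5 s | 3 | 0.58 | 5 | 0.74 | 5 | 0.13 |
| PETV | 2 s | 8 | 2.66 | 2 | 1.35 | 4 | 0.03 |

*Note that the threshold/parameter sets selected for the scenario analysis are in bold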

Downloaded from SAE International by Jeffrey Wishart, Monday, April 05, 2021

Rath Wishart et al 2024 - Evaluating Safety Metrics for VRUs at Urban Traffic Intersections using Infrastructure LIDAR

Conference Paper

Full-text available

Apr 2024

div class="section abstract"> Ensuring the safety of vulnerable road users (VRUs) such as pedestrians, users of micro-mobility vehicles, and cyclists is imperative for the commercialization of automated vehicles (AVs) in urban traffic scenarios. City traffic intersections are of particular concern due to the precarious situations VRUs often encounter when navigating these locations, primarily because of the unpredictable nature of urban traffic. Earlier work from the Institute of Automated Vehicles (IAM) has developed and evaluated Driving Assessment (DA) metrics for analyzing car following scenarios. In this work, we extend those evaluations to an urban traffic intersection testbed located in downtown Tempe, Arizona. A multimodal infrastructure sensor setup, comprising a high-density, 128-channel LiDAR and a 720p RGB camera, was employed to collect data during the dusk period, with the objective of capturing data during the transition from daylight to night. In this study, we present and empirically assess the benefits of high-density LiDAR in low-light and dark conditions—a persistent challenge in VRU detection when compared to traditional RGB traffic cameras. Robust detection and tracking algorithms were utilized for analyzing VRU-to-vehicle and vehicle-to-vehicle interactions using the LiDAR data. The analysis explores the effectiveness of two DA metrics based on the i.e. Post Encroachment Time (PET) and Minimum Distance Safety Envelope (MDSE) formulations in identifying potentially unsafe scenarios for VRUs at the Tempe intersection. The codebase for the data pipeline, along with the high-density LiDAR dataset, has been open-sourced with the goal of benefiting the AV research community in the development of new methods for ensuring safety at urban traffic intersections. </div

Lu Wishart et al 2024 - Validation and Analysis of DA Metrics in Real-World Car-Following Scenarios w Aerial Videos

Conference Paper

Full-text available

Apr 2024

Data-driven driving safety assessment is crucial in understanding the insights of traffic accidents caused by dangerous driving behaviors. Meanwhile, quantifying driving safety through well-defined metrics in real-world naturalistic driving data is also an important step for the operational safety assessment of automated vehicles (AV). However, the lack of flexible data acquisition methods and fine-grained datasets has hindered progress in this critical area. In response to this challenge, we propose a novel dataset for driving safety metrics analysis specifically tailored to car-following situations. Leveraging state-of-the-art Artificial Intelligence (AI) technology, we employ drones to capture high-resolution video data at 12 traffic scenes in the Phoenix metropolitan area. After that, we developed advanced computer vision algorithms and semantically annotated maps to extract precise vehicle trajectories and leader-follower relations among vehicles. These components, in conjunction with a set of defined metrics based on our prior work on Operational Safety Assessment (OSA) by the Institute of Automated Mobility (IAM), allow us to conduct a detailed analysis of driving safety. Our results reveal the distribution of these metrics under various real-world car-following scenarios and characterize the impact of different parameters and thresholds in the metrics. By enabling a data-driven approach to address driving safety in car-following scenarios, our work can empower traffic operators and policymakers to make informed decisions and contribute to a safer, more efficient future for road transportation systems.

A Diversity Analysis of Safety Metrics Comparing Vehicle Performance in the Lead-Vehicle Interaction Regime

Article

Full-text available

Jul 2023

Vehicle performance metrics analyze data sets consisting of subject vehicle’s interactions with other road users in a nominal driving environment and provide certain performance measures as outputs. To the best of the authors’ knowledge, the vehicle safety performance metrics research dates back to at least 1967. To date, there still does not exist a community-wide accepted metric or a set of metrics for vehicle safety performance assessment and justification. This issue gets further amplified with the evolving interest in Advanced Driver Assistance Systems and Automated Driving Systems. In this paper, the authors seek to perform a unified study that facilitates an improved community-wide understanding of vehicle performance metrics using the lead-vehicle interaction operational design domain as a common means of performance comparison. In particular, the authors study the diversity (including constructive formulation discrepancies and empirical performance differences) among 33 base metrics with up to 51 metric variants (with different choices of hyper-parameters) in the existing literature, published between 1967 and 2022. Two data sets are adopted for the empirical performance diversity analysis, including vehicle trajectories from normal highway driving environment and relatively high-risk incidents with collisions and near-miss cases. The analysis further implies that (i) the conceptual acceptance of a safety metric proposal can be problematic if the assumptions, conditions, and types of outcome assurance are not justified properly, and (ii) the empirical performance justification of an acceptable metric can also be problematic as a dominant consensus is not observed among metrics empirically.

A Diversity Analysis of Safety Metrics Comparing Vehicle Performance in the Lead-Vehicle Interaction Regime

Preprint

Jun 2023

Vehicle performance metrics analyze data sets consisting of subject vehicle's interactions with other road users in a nominal driving environment and provide certain performance measures as outputs. To the best of the authors' knowledge, the vehicle safety performance metrics research dates back to at least 1967. To date, there still does not exist a community-wide accepted metric or a set of metrics for vehicle safety performance assessment and justification. This issue gets further amplified with the evolving interest in Advanced Driver Assistance Systems and Automated Driving Systems. In this paper, the authors seek to perform a unified study that facilitates an improved community-wide understanding of vehicle performance metrics using the lead-vehicle interaction operational design domain as a common means of performance comparison. In particular, the authors study the diversity (including constructive formulation discrepancies and empirical performance differences) among 33 base metrics with up to 51 metric variants (with different choices of hyper-parameters) in the existing literature, published between 1967 and 2022. Two data sets are adopted for the empirical performance diversity analysis, including vehicle trajectories from normal highway driving environment and relatively high-risk incidents with collisions and near-miss cases. The analysis further implies that (i) the conceptual acceptance of a safety metric proposal can be problematic if the assumptions, conditions, and types of outcome assurance are not justified properly, and (ii) the empirical performance justification of an acceptable metric can also be problematic as a dominant consensus is not observed among metrics empirically.

OMalley Wishart et al 2024 - A Scenario-Based Test Selection and Scoring Methodology for Inclusion in a Safety Case Framework for Automated Vehicles

Conference Paper

Full-text available

Apr 2024

div class="section abstract"> Effectively determining automated driving system (ADS)-equipped vehicle (AV) safety without relying on testing an infeasibly large number of driving scenarios is a challenge with wide recognition in industry and academia. The following paper builds on previous work by the Institute of Automated Mobility (IAM) and Science Foundation Arizona (SFAz), and proposes a test selection and scoring methodology (TSSM) as part of a safety case-based framework being developed by the SFAz to ensure the safety of AVs while addressing the scenario testing challenge. The TSSM is an AV verification and validation (V&V) process that relies, in part, on iterative, partially random generation of AV driving scenarios. These scenarios are generated using an operational design domain (ODD) and behavioral competency portfolio, which expresses the vehicle ODD and behavioral competencies in terms of quantifiable amounts or intensities of discrete components. Once generated, these scenarios are subjected to filters based on their relevance to the AV ODD and behavioral competency portfolio that preserves the robustness of the generated test set; after filtration, scenarios are assigned to a test method and executed. Further, these scenarios may be generated entirely by the TSSM or may be drawn from a preexisting scenario database and subjected to the same filtration process. After the scenarios assembled by the TSSM are executed, the methodology aggregates their driving assessment (DA) scores into a single numerical value. We outline the overall safety case-based framework, the TSSM, including its role in the framework as well as planned future work, and outline two proofs of concept: (1) a demonstration of the ability of the TSSM to pare down the space of scenarios in a scenario database; and (2) a specification form which may be used to solicit a description of the AV ODD and behavioral competency portfolio from the AV developer. </div

Critical Scenario Identification Concept: The Role of the Scenario-in-the-Loop Approach in Future Automotive Testing

Article

Full-text available

Jan 2023

Zsolt Szalay

Innovative testing and validation methods are prerequisites concerning Connected, Cooperative, and Automated Mobility (CCAM) as the high number of cooperating participants and concurrent processes critically increase the probability of adverse safety and security incidents. The proposed new approaches deal with this increasing complexity of not currently having generally accepted validation mechanisms. The paper introduces a novel, mathematical model based, scenario identification methodology, facilitating the selection of critical road vehicle traffic scenarios, taking into account different testing objectives, such as maximizing the safety risk of the analyzed system. The presented results verify that applying specific decision models and quantifiable indicators related to the system elements of highly automated mobility systems can significantly contribute to the systematic identification of unsafe corner cases in connected and cooperative autonomous systems.

CAROM Air -- Vehicle Localization and Traffic Scene Reconstruction from Aerial Videos

Preprint

Full-text available

May 2023

Road traffic scene reconstruction from videos has been desirable by road safety regulators, city planners, researchers, and autonomous driving technology developers. However, it is expensive and unnecessary to cover every mile of the road with cameras mounted on the road infrastructure. This paper presents a method that can process aerial videos to vehicle trajectory data so that a traffic scene can be automatically reconstructed and accurately re-simulated using computers. On average, the vehicle localization error is about 0.1 m to 0.3 m using a consumer-grade drone flying at 120 meters. This project also compiles a dataset of 50 reconstructed road traffic scenes from about 100 hours of aerial videos to enable various downstream traffic analysis applications and facilitate further road traffic related research. The dataset is available at https://github.com/duolu/CAROM.

Querying Labeled Time Series Data with Scenario Programs

Preprint

Full-text available

Jun 2024

Devan Shanker

In order to ensure autonomous vehicles are safe for on-road deployment, simulation-based testing has become an integral complement to on-road testing. The rise in simulation testing and validation reflects a growing need to verify that AV behavior is consistent with desired outcomes even in edge case scenarios $-$ which may seldom or never appear in on-road testing data. This raises a critical question: to what extent are AV failures in simulation consistent with data collected from real-world testing? As a result of the gap between simulated and real sensor data (sim-to-real gap), failures in simulation can either be spurious (simulation- or simulator-specific issues) or relevant (safety-critical AV system issues). One possible method for validating if simulated time series failures are consistent with real world time series sensor data could involve retrieving instances of the failure scenario from a real-world time series dataset, in order to understand AV performance in these scenarios. Adopting this strategy, we propose a formal definition of what constitutes a match between a real-world labeled time series data item and a simulated scenario written from a fragment of the Scenic probabilistic programming language for simulation generation. With this definition of a match, we develop a querying algorithm that identifies the subset of a labeled time series dataset matching a given scenario. To allow this approach to be used to verify the safety of other cyber-physical systems (CPS), we present a definition and algorithm for matching scalable beyond the autonomous vehicles domain. Experiments demonstrate the precision and scalability of the algorithm for a set of challenging and uncommon time series scenarios identified from the nuScenes autonomous driving dataset. We include a full system implementation of the querying algorithm freely available for use across a wide range of CPS.

Method for Comparison of Surrogate Safety Measures in Multi-Vehicle Scenarios

Conference Paper

Jun 2023

Assessing the Criticality of Longitudinal Driving Scenarios using Time Series Data

Preprint

May 2023

Nico Schick

Unfortunately, many people die in car accidents. To reduce these accidents, cars are equipped with driving safety systems. With autonomous vehicles, the driver's behavior becomes irrelevant as the car drives autonomously. All autonomous driving algorithms must undergo extensive testing and validation, especially for safety-critical scenarios. Therefore, the detection of safety-critical driving scenarios is essential for autonomous vehicles. This publication describes safety indicator metrics based on time series covering longitudinal driving data to detect safety-critical driving scenarios.

Driving Safety Performance Assessment Metrics for ADS-Equipped Vehicles

Conference Paper

Full-text available

Apr 2020

div class="section abstract"> The driving safety performance of automated driving system (ADS)-equipped vehicles (AVs) must be quantified using metrics in order to be able to assess the driving safety performance and compare it to that of human-driven vehicles. In this research, driving safety performance metrics and methods for the measurement and analysis of said metrics are defined and/or developed. A comprehensive literature review of metrics that have been proposed for measuring the driving safety performance of both human-driven vehicles and AVs was conducted. A list of proposed metrics, including novel contributions to the literature, that collectively, quantitatively describe the driving safety performance of an AV was then compiled, including proximal surrogate indicators, driving behaviors, and rules-of-the-road violations. These metrics, which include metrics from on- and off-board data sources, allow the driving safety performance of an AV to be measured in a variety of situations, including crashes, potential conflicts, and near misses. These measurements enable the evaluation of temporal flows and the quantification of key aspects of driving safety performance. The identification and exploration of metrics focusing explicitly on AVs as well as proposing a comprehensive set of metrics is a unique contribution to the literature. The objective is to develop a concise set of metrics that allow driving safety performance assessments to be effectively made and that align with the needs of both the ADS development and transportation engineering communities and accommodate differences in cultural/regional norms. Concurrent project work includes equipping an intersection with a sensor suite of cameras, LIDAR, and RADAR to collect data requiring off-board sources and employing test AVs to collect data requiring on-board sources. 
Additional concurrent work includes development of artificial intelligence and computer vision-based algorithms to automatically calculate the metrics using the collected data. Future work includes using the collected data and algorithms to finalize the list of metrics and then develop a methodology that uses the metrics to provide an overall driving safety performance assessment score for an AV. </div

Performance analysis of a time headway based rate control algorithm for VANET safety applications

Conference Paper

Full-text available

Dec 2013

Vehicular ad hoc network is considered as an integral part of the future intelligent transportation system. As the number of applications which are supported by the vehicular communication grow, the efficient utilization of the control channel and congestion control become important issues. The periodic broadcast of basic safety messages (BSM) by the vehicles consumes most of the control channel interval, leaving less transmission capacity for other types of traffic. To accommodate the data packets from other safety and non-safety applications, the packet transmission rate of BSM must be controlled without compromising the safety of the vehicles. In this paper, we present a BSM generation rate control algorithm based on the measured time headway of the vehicles. The performance analysis shows that the proposed rate control algorithm reduces the channel utilization and improves the BSM reception ratio at different vehicle densities and vehicle speeds. Moreover, the proposed rate control algorithm effectively reduces the notification time of a multi-hop warning message.

Driver Assisting by Inverse Time to Collision

Conference Paper

Full-text available

Aug 2006

The paper is proposing a specific indicator, the inverse time to collision TTC<sup>-1</sup>, useful when analyzing the highway traffic. The advantage of TTC<sup>-1</sup> vs. TTC is a direct and continuous dependence with the collision risk. TTC<sup>-1</sup> could be used as an input in car following algorithms. Because the automate driving is yet in a research stage, a feasible application for TTC<sup>-1</sup> would be rather assisting the driver of the following car at the choice of the distance gap towards the first car.

How safe is safe enough? Automatic Safety Constraints Boundary Estimation for Decision-Making in Automated Vehicles

Conference Paper

Oct 2020

Model Predictive Instantaneous Safety Metric for Evaluation of Automated Driving Systems

Conference Paper

Oct 2020

Development and Application of a Collision Avoidance Capability Metric

Conference Paper

Apr 2020

Towards Standardization of AV Safety: C++ Library for Responsibility Sensitive Safety

Conference Paper

Aug 2019

The need for safety in Automated Driving (AD) is becoming increasingly critical with the accelerating deployment of this technology. Beyond functional safety, industry must guarantee the operational safety of automated vehicles. Towards that end, Mobileye introduced the Responsibility Sensitive Safety (RSS), a model-based approach to Safety [1]. In this paper we expand upon this work introducing the C ++ Library for Responsibility Sensitive Safety, an open source executable that implements a subset of RSS. We provide architectural details to integrate the C ++ Library for Responsibility Sensitive Safety with AD Software pipelines as safety module overseeing decision making of driving policies. We illustrate this application with an example integration with the Baidu Apollo AD stack and simulator, [2] and [3], that provides safety validation of the planning module. Furthermore, we show how the C ++ Library for Responsibility Sensitive Safety can be used to explore the usefulness of the RSS model through parameter exploration and analysis on minimum safe longitudinal distance, (dmin), considering different weather conditions. We also compare these results with half-of-speed rule followed in some parts of the world. We expect that the C ++ Library for Responsibility Sensitive Safety becomes a critical component of future tools for formal verification, testing and validation of AD safety and that it helps bootstrap the AD research efforts towards standardization of safety.

Surrogate safety measures from traffic simulation models

Article

2003

Safety is emerging as an area of increased attention and awareness within transportation engineering. Historically, the safety of new and innovative traffic treatments has been difficult to assess, primarily because of a lack of good predictive models of crash potential and a lack of consensus on what constitutes a safe or unsafe facility. An FHWA-sponsored research project investigated the potential to derive surrogate measures of safety from existing traffic simulation models. These surrogate measures could then be used to support evaluations of various traffic engineering alternatives, including facilities that have not yet been built and strategies that have not yet been used. Each surrogate measure is collected on the basis of the occurrence of a conflict event, which is an interaction between two vehicles in which one vehicle must take evasive action to avoid a collision. The surrogate measures that are proposed as the best are time to collision, postencroachment time, deceleration rate, maximum speed, and speed differential. Time to collision, postencroachment time, and deceleration rate can be used to measure the severity of the conflict. Maximum speed and the speed differential can be used to measure the severity of the potential collision (by use of additional information about the mass of the vehicles involved to assess momentum). After the simulation model is executed for a number of iterations, a postprocessing tool would be used to compute the statistics for the various measures and perform comparisons between design alternatives.

The Safety Engineering Resource

Apr 2008


Safetyengineering.wordpress.com, "The Safety Engineering Resource," April 18, 2008, accessed Jan. 8, 2021.

CARLA: An Open Urban Driving Simulator

Jan 2017

A Dosovitskiy
G Ros
F Codevilla
A Lopez
V Koltun

Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., and Koltun, V., "CARLA: An Open Urban Driving Simulator," in 1st Annual Conference on Robot Learning, 2017.
