
Can Data Generated by Connected Vehicles Enhance Safety?: Proactive Approach to Intersection Safety Management


Kamrani, Wali, & Khattak 1
TRB PAPER # 17-00238
Can Data Generated by Connected Vehicles Enhance Safety?
A proactive approach to intersection safety management
Mohsen Kamrani
Graduate Research Assistant, Department of Civil & Environmental Engineering
The University of Tennessee
mkamrani@vols.utk.edu
Behram Wali
Graduate Research Assistant, Department of Civil & Environmental Engineering
The University of Tennessee
bwali@vols.utk.edu
Asad J. Khattak, Ph.D.
Beaman Distinguished Professor, Department of Civil & Environmental Engineering
The University of Tennessee
akhattak@utk.edu
March, 2017
Text Count: 6457, Tables and Figures: 4, Total Word Count: 7457
Final Version Submitted for Publication to the 96th Annual Meeting
Transportation Research Board
January 2017
Washington, D.C.
Mohsen Kamrani, Behram Wali, Asad J. Khattak
The University of Tennessee, Knoxville
Abstract – Traditionally, evaluation of intersection safety has been largely reactive, based
on historical crash frequency data. However, the emerging data from Connected and
Automated Vehicles (CAVs) can complement historical data and help proactively
identify intersections that have high levels of variability in instantaneous driving
behaviors prior to the occurrence of crashes. Based on data from the Safety Pilot Model
Deployment in Ann Arbor, Michigan, this study developed a unique database that
integrates intersection crash and inventory data with more than 65 million real-world Basic
Safety Messages logged by 3,000 connected vehicles, providing a more complete picture
of the operations and safety performance of intersections. As a proactive safety measure and a
leading indicator of safety, this study introduces location-based volatility (LBV), which
quantifies variability in instantaneous driving decisions at intersections. LBV represents
the driving performance of connected vehicle drivers traveling through a specific
intersection. Using the coefficient of variation as a standardized measure of relative
dispersion, LBVs are calculated for 116 intersections in Ann Arbor. To quantify
relationships between intersection-specific volatilities and crash frequencies, rigorous
fixed- and random-parameter Poisson regression models are estimated. While controlling
for exposure-related factors, the results provide evidence of a statistically significant (5%
level) positive association between intersection-specific volatility and crash frequencies
for signalized intersections. The implications of the findings for proactive intersection
safety management are discussed in the paper.
Keywords: Proactive Safety, Driving Volatility, Crash Waiting to Happen, Connected Vehicles,
Basic Safety Messages, Big Data, Fixed and Random Parameters, Poisson Regression
INTRODUCTION
There is considerable evidence that vehicle conflicts at intersections result in crashes, making
intersections among the most dangerous locations on roadways (1; 2). Traditionally, intersection
safety evaluations are based on historical data and are largely reactive, i.e., state-of-the-art
methods characterize unsafe intersections based on historical and expected crash frequencies (2;
3). Safety treatments can then be applied to intersections identified through this historical crash
data methodology. However, variability in instantaneous driving behaviors can be a leading
indicator of unsafe outcomes such as crashes/incidents. In this study, we posit that expanding
the concept of driving volatility (4-6) to specific locations (termed Location-Based Volatility)
by using real-world large-scale connected vehicle data has significant potential to unveil
critical relationships between extreme driving behaviors (and their fluctuations) and safety
outcomes at specific intersections.
The Safety Pilot Model Deployment (SPMD) offers detailed and relevant data. This pilot,
underway in Ann Arbor, Michigan, is intended to demonstrate vehicle-to-vehicle (V2V) and vehicle-
to-infrastructure (V2I) communication in a real-world environment. Within SPMD, Basic Safety
Messages (BSMs) are rich information packets (exchanged at a frequency of 10 Hz) that
describe a vehicle’s position, motion, component status, and other relevant information
exchanged between vehicles/infrastructure through V2V and V2I applications (7). Such emerging
data has been used for creating trip-based driving volatilities for drivers, capable of identifying
abnormal or extreme behaviors prior to unsafe outcomes such as crashes/incidents (6). Important
in this aspect is the concept of “driving volatility” that captures the extent of variations in driving,
especially hard accelerations/braking, jerky maneuvers, and frequent switching between different
driving regimes (4). Specifically, Wang et al. (5) and Liu and Khattak (6) examined the
relationships between trip-based driving volatility and several factors such as demographics, trip
purpose, duration, and detailed vehicle characteristics. These studies concluded that driver-specific
trip-based volatilities have potential for developing advanced traveler information systems, driving
feedback devices, and alternative fuel vehicle purchase decision tools (5; 6).
This study focuses on developing an analytic methodology to examine instantaneous driving
behaviors at specific locations, and their variability. The paper explores how variability in driving
can be mapped to historical safety outcomes such as crashes at specific locations. Such an analysis
is fundamental to proactive intersection safety management.
LITERATURE REVIEW
Ongoing research in the connected vehicles (CV) area follows several major directions.
Topics such as network robustness and information propagation efficiency are still under
investigation in order to establish better vehicle-to-vehicle (V2V) and vehicle-to-infrastructure
(V2I) connections (8). Another direction concerns systems and algorithms whose ultimate goal is
to reduce the gap between vehicles in order to increase road capacity and reduce fuel
consumption through methods such as speed harmonization (9), trajectory optimization (10), and
platooning, as discussed in Bergenhem et al. (11). There are also a number of studies (not
necessarily in the CV area) that try to characterize aggressive, reckless, or risky driving styles
(12). In many of them, speed limits are the threshold used to judge a driver’s performance
(13; 14). An important finding is that risky driving behaviors are positively correlated with
the likelihood of crashes or near-crash events (15). Furthermore, a broad spectrum of studies
related to connected vehicle systems has proposed mechanisms for issuing warnings or alerts to
drivers through CV applications and examined their effect on safety. For instance, Chrysler et
al. (16) investigated the effect of warning messages on drivers’ ability to handle primary and
secondary threats. The results showed an improved detection time for the primary threat but an
increased reaction time to the secondary threat, which was placed after the primary threat.
Another study (17) explored the impacts of dynamic route guidance on work zone safety under
different market penetration rates of CVs. According to the results, CV penetration of 40% and
below improves work zone safety, while penetration above that level decreases it. However, these
benefits depend on the information dissemination delay (18). Although the positive effects of
warning messages have been investigated, how those warnings should be created from BSMs is
still underexplored.
One approach is to link the generation of warning messages to drivers’ behavior. In
some recent studies, authors have initiated efforts to extract useful information from BSMs to
understand drivers’ behavior. For instance, a measure of driving performance in connected
vehicle networks has been defined as “driving volatility” (19). Trip-based driving volatility
was introduced (19) to account for the variation of driving behaviors under different conditions
using an objective driving performance evaluation metric, i.e., vehicular jerk. More specifically,
Liu et al. (20) studied extreme driving behaviors (trip-based volatility) using exhaustive
high-frequency connected vehicle data, and the analysis demonstrated a framework for the
generation of warnings/alerts for connected vehicles informing drivers about potential hazards.
Another study (21) proposed a way to identify abnormal or extreme behaviors (i.e., hard
accelerations and decelerations) from BSMs, and to warn drivers through V2V, V2I, or other
connected vehicle applications. In this paper, the authors believe that expanding the concept of
driving volatility in the connected vehicle environment to specific locations has significant
potential for identifying hazardous roadway segments. Such a perspective of location-specific
driving behavior in connected vehicle systems has not previously been identified and analyzed.
Therefore, this paper aims at developing the new concept of location-based driving volatility
(LBV) by using BSMs exchanged between connected vehicles in the real world and linking it to
historical crash data, with the purpose of identifying hazardous spots proactively. Although the
novelty of this study lies in using high-volume and high-velocity connected vehicle data, the
significance of work done by other researchers on crash frequency cannot be overlooked, given
the emergence of new approaches; see, e.g., Lord & Mannering (22). Random-parameter and/or
varying-coefficient models have also become popular as opposed to fixed-parameter models for
their capability to address unobserved heterogeneity (23-25).
Research Objective and Contribution
The objectives of this study are to:
1) Quantify instantaneous driving decisions and their variability in intersection-specific Basic
Safety Messages (BSMs).
2) Understand the relationship between intersection-specific volatility and crash
frequencies, while controlling for other variables, using rigorous statistical tools.
The present study contributes by analyzing real-world large-scale connected vehicle data
to extract critical driving behavior information embedded in raw BSMs. Such an analysis is
important because driving actions and behaviors are believed to be the main cause of traffic
crashes, and understanding the relationship between location-based volatility and historical crash
data can provide fundamental knowledge regarding proactive safety countermeasures. A unique
aspect of this study is that significant efforts have been undertaken to integrate large-scale
connected vehicle data (more than 65 million BSMs) with intersection crash and inventory data
in order to provide a more complete picture of the operations and safety performance of
intersections. The assembled database allows investigation of correlations between a potential
leading indicator of safety (location-based volatility) and historical crash frequencies. By taking
the first step towards proactive safety using large-scale connected vehicle data, the current study
is original and timely in the sense that real-world data have been processed and used to
understand the phenomena under discussion.
METHODOLOGY
Conceptual Framework
The two-month connected vehicle data from the Safety Pilot Model Deployment (SPMD)
(https://www.its-rde.net/home) contain rich information (i.e., basic safety messages at 10 Hz) that
was exchanged between vehicles/infrastructure through vehicle-to-vehicle (V2V) and vehicle-to-
infrastructure (V2I) applications. Such data provide us with an opportunity to scrutinize the
mechanisms that lead to unsafe events on roadways. However, the methods for making good use
of such high-volume and high-resolution data need further development. SPMD collects Basic
Safety Messages (BSMs) that describe a vehicle’s position, motion, component status, and other
relevant travel information (26). However, raw BSMs are not informative to drivers when they
need to make decisions based on information received through V2V or V2I applications. Most BSMs
describe normal driver behaviors, whereas abnormal and highly fluctuating driver behaviors
determine the safety of driving in the short term.
This study focuses on developing an innovative methodology for estimating location-
based volatility for specific intersections and comparing it with their observed crashes. We
hypothesize that the nature of extreme instantaneous driving behaviors at intersections can be
correlated with their crash history. Such correlations can help us understand instantaneous driving
behaviors and how they relate to transportation safety. Location-based volatility (LBV) represents
the driving performance of a substantial number of users traveling through a specific location.
LBV may play a critical role in highway safety management, as it will highlight locations where
many drivers behave differently than at other locations. Proactive countermeasures can be
considered at such locations. If many drivers exhibit extreme driving behaviors or if driving
behaviors are highly fluctuating at certain locations, the reasons for such extreme behaviors may
be related to factors such as road conditions. Such information can be disseminated to
connected vehicle drivers through roadside equipment (RSE), which is able to send information
to vehicles, and thus drivers may be alerted about potential hazards (e.g., conflicts/intersection
sight distance) while traveling through certain intersections.
First, the connected vehicle data consisting of geocodes and longitudinal accelerations were
cleaned. In the next step, 116 intersections were identified in Ann Arbor, Michigan (discussed later).
Crash data along with other geometric elements (provided in Table 1) were collected. Then, four
different coefficients of variation ($C_v$) were calculated and used as measures of
location-based volatility (LBV) for each intersection (within 150 ft. of the center of each
intersection). Given the hypothesis that higher LBV is likely to be positively correlated with
historical crash data at intersections, appropriate statistical models were developed to investigate
the correlation between LBV (among other traffic exposure factors) and crash frequency. The
knowledge generated from the modeling results can identify intersections where drivers, on
average, show higher volatility in their instantaneous driving decisions (e.g., longitudinal
acceleration), and where such volatilities are found to be correlated with crash frequency. By
carefully analyzing high-resolution real-world data transmitted between connected vehicles and
applying appropriate statistical methods, we can ultimately generate proactive (rather than
traditionally reactive) alerts and warnings for drivers at intersections. Such proactive warnings
and alerts can be disseminated through roadside equipment to vehicles approaching specific
intersections to warn them regarding the chance of crash occurrence at, or the crash ranking of,
the intersection. In the next section, the computation of LBV is discussed.
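The step of selecting BSMs within 150 ft. of an intersection center can be sketched as follows. This is a minimal illustration, not the authors' procedure: the haversine great-circle distance, the record layout (dictionaries with `lat` and `lon` keys), and the helper names `haversine_ft` and `bsms_near` are all assumptions introduced here.

```python
import math

# Mean Earth radius (6,371 km) converted to feet; an approximation.
EARTH_RADIUS_FT = 6_371_000 / 0.3048


def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))


def bsms_near(bsms, center, radius_ft=150.0):
    """Keep the BSM records whose position lies within radius_ft of center.

    bsms   : iterable of dicts with 'lat' and 'lon' keys (hypothetical layout)
    center : (lat, lon) of the intersection center
    """
    lat0, lon0 = center
    return [b for b in bsms
            if haversine_ft(b["lat"], b["lon"], lat0, lon0) <= radius_ft]
```

For 65 million records a spatial index or a bounding-box pre-filter would likely be needed; the linear scan above only shows the selection rule.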
Location Based Volatility
Understanding instantaneous driving volatility at specific intersections is one of the most
challenging aspects of the current study. To calculate location-based volatility, different
instantaneous driving measures can be used such as accelerations, steering angles or position of
brakes (6). As explicitly discussed in Liu and Khattak (6), volatility in trip-based instantaneous
driving decisions should be captured by considering both longitudinal and lateral accelerations.
Considering longitudinal acceleration as the only measure of driving volatility can mask important
information embedded in instantaneous driving data. For instance, at moments longitudinal
acceleration can be low and thus considered normal, but the driver could still be volatile due to
large magnitudes of lateral accelerations.
To calculate LBV, the authors intended to use longitudinal and lateral accelerations, as they
are direct outcomes of vehicle maneuvering. However, due to a considerable amount of
questionable lateral acceleration data (see Data Accuracy section), only longitudinal acceleration
data were used. The longitudinal acceleration data are reasonable and available for all BSMs and
have been error-checked by estimating accelerations from the speed trajectories of the vehicles.
Given the data limitation, this study focuses only on capturing location-based volatility by using
longitudinal accelerations. There are two reasons for this decision. First, excluding lateral
acceleration does not seem to affect the results drastically, since lateral acceleration is more
informative in trip-based volatility calculation, where the curvature of the road changes and where
the length of the trip allows several lane changes. Second, discarding the records with questionable
lateral acceleration would reduce the amount of data for several intersections, leading to a
reduction in sample size, i.e., the number of intersections.
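The error check of reported longitudinal acceleration against speed trajectories could look like the sketch below. The paper does not describe the exact procedure, so the forward-difference estimate, the 1.0 m/s² tolerance, the assumed units (m/s and m/s²), and the function names are illustrative assumptions only.

```python
def accel_from_speed(speeds, dt=0.1):
    """Forward-difference acceleration from consecutive speed samples.

    dt=0.1 s corresponds to the 10 Hz BSM frequency; element i of the
    result is the change between sample i and sample i + 1.
    """
    return [(v1 - v0) / dt for v0, v1 in zip(speeds, speeds[1:])]


def flag_inconsistent(speeds, reported_accel, dt=0.1, tol=1.0):
    """Return indices of samples whose reported acceleration differs from
    the speed-based estimate by more than tol (hypothetical threshold)."""
    estimated = accel_from_speed(speeds, dt)
    return [i + 1 for i, est in enumerate(estimated)
            if abs(est - reported_accel[i + 1]) > tol]
```

The first sample has no preceding speed, so it is never flagged in this sketch.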
Calculation of LBV
The present study uses a standardized measure of dispersion called the Coefficient of Variation
($C_v$) (also known as the relative standard deviation) for quantifying the fluctuations in
longitudinal accelerations/decelerations at a specific intersection. Note that different measures
such as range, interquartile range, variance, or standard deviation can be used for capturing
variability in longitudinal accelerations. Although standard deviation and variance are preferable
because all the information embedded in the data is used for calculating variability, both measures
are insensitive to the magnitude of acceleration values in the data. Thus, we prefer the relative
measure of dispersion (Coefficient of Variation), where the dispersion in accelerations or
decelerations is quantified as a proportion of their mean. This approach can capture the
variability (e.g., standard deviation) in instantaneous driving decisions with respect to the mean
accelerations or decelerations undertaken by different drivers at a specific intersection.
To compute volatility for each intersection, two speed bins (see Figure 1a), one from the
minimum observed speed to the mean and one from the mean to the maximum speed, were considered.
The rationale behind considering speed bins is that the acceleration capability of a vehicle depends
on the current vehicle speed, i.e., at higher speeds the capability to accelerate decreases compared
to the acceleration capability at lower speeds. For each bin within an intersection, acceleration and
deceleration values are separated, and their means and standard deviations are computed. Finally,
$C_v$ as a measure of LBV is obtained by dividing the standard deviation of the accelerations (or
decelerations) by their mean, i.e.,
$$C_v = \frac{\sigma}{\mu}$$
where $\sigma$ and $\mu$ denote the standard deviation and mean of the values in that group. For
each intersection, four $C_v$s are reported as shown in Figure 1a. The calculated $C_v$s for
a specific intersection provide a relative measure of dispersion of longitudinal accelerations with
respect to their means, and thus different intersections can be compared based on their $C_v$s.
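The calculation described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions rather than the authors' implementation: records are hypothetical `(speed, longitudinal_acceleration)` pairs, samples with zero acceleration are dropped, a speed equal to the mean falls in the lower bin, the sample standard deviation is used, and the absolute value of the mean is taken so that decelerations yield a positive value. The group labels (`acc_low`, and so on) are names introduced here.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the absolute mean (None if undefined)."""
    if len(values) < 2:
        return None
    m = mean(values)
    return stdev(values) / abs(m) if m != 0 else None


def location_based_volatility(records):
    """Four coefficients of variation for one intersection.

    records: iterable of (speed, longitudinal_acceleration) pairs observed
    within the intersection's influence area (hypothetical layout).
    """
    records = list(records)
    mean_speed = mean(speed for speed, _ in records)
    groups = {"acc_low": [], "acc_high": [], "dec_low": [], "dec_high": []}
    for speed, accel in records:
        if accel == 0:
            continue  # neither accelerating nor decelerating (assumption)
        kind = "acc" if accel > 0 else "dec"
        speed_bin = "low" if speed <= mean_speed else "high"
        groups[f"{kind}_{speed_bin}"].append(accel)
    return {name: coefficient_of_variation(vals)
            for name, vals in groups.items()}
```

Calling `location_based_volatility` once per intersection gives the four values that would enter the crash frequency models as explanatory variables.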
Modeling Approach
After quantification of volatility for each intersection, we investigate the correlations between
location-based volatility (for each intersection), crash data, and other traffic related factors.
Appropriate modeling can provide empirical evidence of how intersection location-based
volatility relates to historical crash data. Given the count nature of crashes, Poisson and/or Poisson-
gamma models (Negative Binomial) can be estimated depending on the mean and variance of crash
data. For a Poisson model, the probability of having a specific number of crashes $n_i$ at
intersection $i$ can be written as (27):
$$P(n_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{n_i}}{n_i!} \tag{1}$$
where $P(n_i)$ is the probability of $n_i$ crashes occurring at intersection $i$ per specific
time period, and $\lambda_i$ is the Poisson parameter for intersection $i$, which is numerically
equivalent to the expected crash frequency of intersection $i$ per year, $E[n_i]$. The regression
can be fitted to crash data by specifying $\lambda_i$ as a function of explanatory variables such as
location-based volatility, Annual Average Daily Traffic, and speed limits on the major and minor
approaches. Formally, $\lambda_i$ can be specified through a log link function of a set of
independent variables (27):
$$\lambda_i = \exp(\beta X_i) \tag{2}$$
where $X_i$ is a vector of explanatory variables and $\beta$ is a vector of estimable parameters.
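As a small numerical illustration of the two relations above, the sketch below assumes the standard Poisson probability mass function and the log link, in which the expected crash frequency is the exponential of a linear combination of explanatory variables. The function names are introduced here, and any coefficient values passed in are hypothetical, not estimates from the paper.

```python
import math


def expected_crashes(beta, x):
    """Poisson parameter under a log link: exp of the linear predictor."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))


def poisson_pmf(n, lam):
    """Probability of observing exactly n crashes when the mean is lam."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


# Hypothetical coefficients: intercept and one volatility measure.
lam = expected_crashes([0.5, 0.8], [1.0, 1.2])
probabilities = [poisson_pmf(n, lam) for n in range(5)]
```

Because of the log link, the predicted frequency is always positive, which is one reason this form suits count outcomes.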
Application of Poisson regression to over-dispersed crash data can produce inappropriate
results. If the mean and variance of the crash data are not equal, a corrective measure is applied to
Equation 2 by adding an independently distributed error term $\varepsilon_i$. While the presence
of over-dispersion can be indicated by the mean and variance of the crash data (27), formally a
Lagrange multiplier (LM) test can be performed to statistically test for the existence of
over-dispersion in the Poisson model (27). The test statistic is defined as:
$$LM = \left[\frac{\sum_{i=1}^{N}\left[(n_i-\hat{\lambda}_i)^2 - n_i\right]}{\sqrt{2\sum_{i=1}^{N}\hat{\lambda}_i^2}}\right]^2 \tag{3}$$
where $n_i$ is the actual crash frequency for intersection $i$, $\hat{\lambda}_i$ is the expected
crash frequency for intersection $i$ as predicted by the Poisson model, and $N$ is the number of
observations. The null hypothesis is that Poisson regression is appropriate for the crash data at
hand. Under this hypothesis, the LM test statistic has a chi-square distribution with one degree of
freedom. If the LM statistic obtained from Equation 3 is less than the critical chi-square value of
3.84 at the 95% level of confidence, Poisson regression should be favored; otherwise, Negative
Binomial regression can be more appropriate (27).
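The test can be sketched as below. The exact expression the authors used is not recoverable with certainty from this text, so the code assumes a commonly used form of the LM statistic for testing a Poisson model against a Negative Binomial alternative: the squared ratio of the sum of (squared residual minus observed count) to the square root of twice the sum of squared fitted means. The function name and the example numbers are hypothetical.

```python
import math


def lm_overdispersion(observed, fitted):
    """LM statistic for over-dispersion in a fitted Poisson model.

    observed : observed crash counts, one per intersection
    fitted   : Poisson-predicted means for the same intersections
    Assumed form (see lead-in): [sum((y - m)^2 - y)]^2 / (2 * sum(m^2)).
    """
    numerator = sum((y - m) ** 2 - y for y, m in zip(observed, fitted))
    denominator = math.sqrt(2 * sum(m ** 2 for m in fitted))
    return (numerator / denominator) ** 2


# Hypothetical counts and fitted means, for illustration only.
lm = lm_overdispersion([2, 0, 3, 1], [1.5, 1.5, 1.5, 1.5])
favor_poisson = lm < 3.84  # critical chi-square value, 1 d.o.f., 95% level
```

Under equidispersion the squared residuals roughly equal the counts, so the numerator stays near zero and the statistic stays small.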
Finally, it is likely that the associations between key explanatory variables and crash
frequency are not consistent across intersections. This intrinsic unobserved heterogeneity can
arise due to several observed and unobserved factors related to intersection crash frequency that
may not be available in the data at hand. This is referred to as omitted variable bias in the safety
literature (27). Furthermore, if key variables are omitted from the analysis and too few variables
are included in the model, it is likely that location-based volatility (an explanatory factor)
captures those effects, so that its estimate may not reflect the true association between
location-based volatility and crash frequency. One way to address this issue is to allow parameter
estimates to vary across observations (27). As such, random parameters can be included in the
estimation framework as:
$$\beta_i = \beta + \varphi_i \tag{4}$$
where $\varphi_i$ is a randomly distributed term with any pre-specified distribution, such as a
normal distribution with mean zero and variance $\sigma^2$. With Equation 4, the Poisson parameter
in Equation 2 becomes:
$$\lambda_i \mid \varphi_i = \exp(\beta_i X_i) \tag{5}$$
and the Poisson parameter in the Poisson-Gamma model, with the error term $\varepsilon_i$ added to
Equation 2, becomes:
$$\lambda_i \mid \varphi_i = \exp(\beta_i X_i + \varepsilon_i) \tag{6}$$
Finally, the following log-likelihood function for the random-parameter model can be maximized
through the maximum simulated likelihood technique (23):
$$LL = \sum_{\forall i} \ln \int_{\varphi_i} g(\varphi_i)\, P(n_i \mid \varphi_i)\, d\varphi_i \tag{7}$$
where $g(\cdot)$ is the probability density function of the randomly distributed term $\varphi_i$
with a pre-specified distribution, such as a normal distribution with mean zero and variance
$\sigma^2$. More details on random parameter models can be found in (23).
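To make the simulated likelihood idea concrete, the sketch below evaluates a simulated log-likelihood for a Poisson model with a single normally distributed random parameter, replacing the integral over the random term with an average over Halton draws. It is a simplified illustration under several assumptions: only one parameter is random, the same base-2 Halton draws are reused for every observation (estimation software typically uses different draws per observation), and no optimizer is included. All function and argument names are introduced here.

```python
import math
from statistics import NormalDist


def halton(n_draws, base=2):
    """First n_draws points of the Halton sequence in the given prime base."""
    points = []
    for i in range(1, n_draws + 1):
        f, r, k = 1.0, 0.0, i
        while k > 0:
            f /= base
            r += f * (k % base)
            k //= base
        points.append(r)
    return points


def poisson_logpmf(n, lam):
    """Log of the Poisson probability of n events with mean lam."""
    return -lam + n * math.log(lam) - math.lgamma(n + 1)


def simulated_loglik(beta_fixed, beta_mean, beta_sd,
                     X_fixed, x_random, counts, n_draws=200):
    """Simulated log-likelihood with one normal random parameter.

    beta_fixed : coefficients of the fixed-parameter variables
    beta_mean, beta_sd : mean and standard deviation of the random parameter
    X_fixed : list of rows of fixed-parameter variables
    x_random : values of the variable carrying the random parameter
    counts : observed crash counts
    """
    # Map uniform Halton points to standard normal draws.
    draws = [NormalDist().inv_cdf(u) for u in halton(n_draws)]
    total = 0.0
    for row, xr, n in zip(X_fixed, x_random, counts):
        fixed_part = sum(b * v for b, v in zip(beta_fixed, row))
        prob = 0.0
        for z in draws:
            lam = math.exp(fixed_part + (beta_mean + beta_sd * z) * xr)
            prob += math.exp(poisson_logpmf(n, lam))
        total += math.log(prob / n_draws)
    return total
```

With `beta_sd` set to zero, every draw gives the same Poisson parameter and the function reduces to the ordinary fixed-parameter Poisson log-likelihood, which is a convenient sanity check.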
DATA
The data used in this study (retrieved from https://www.its-rde.net/home) are BSMs from vehicles
participating in the SPMD in Ann Arbor, Michigan. SPMD is a comprehensive data collection effort,
under real-world conditions, at the Ann Arbor test site with multimodal traffic, hosting
approximately 3,000 connected vehicles equipped with V2V and V2I communication devices. BSMs are
frequently transmitted messages (usually at 10 Hz) that are meant to increase a vehicle’s
situational awareness. At its core, the dataset contains each vehicle’s instantaneous driving
status, including its position (latitude, longitude, and elevation) and motion (heading, speed,
accelerations).
To examine correlations, location-based volatility (LBV) data for each intersection (as
explained earlier) are linked with historical crash data, annual average daily traffic (AADT) data
for major and minor approaches, speed limits on major and minor approaches, and the number of
approaches at each intersection. Such data are publicly available on the website of the Metropolitan
Planning Organization: http://semcog.org/Data-and-Maps. Out of all intersections in the Ann
Arbor area, 116 intersections are identified for which connected vehicle data are available, i.e.,
connected vehicles pass through these intersections and generate enough data for the calculation of
LBV. Finally, five years of crashes (2011-2015) along with geometric factors and flows were
extracted and linked to LBV for each intersection. Note that the data are not available in
spreadsheet format, and thus significant effort went into carefully extracting the data manually and
linking them to LBV for the 116 intersections.
Data Accuracy
Based on the distributions of key variables provided in Table 1, the data seem to be of reasonable
quality. To assure the accuracy of the intersection data, after initial collection, another person
checked a random 10% of the intersection data, and no discrepancies were observed. Also, the
descriptive statistics of intersection data in Table 1 show reasonable differences between
signalized and un-signalized intersections. The major inaccuracy in the data comes from the lateral
acceleration, as shown in Figure 1b. Since 27,240,788 data points (42% of the data) had the maximum
allowable value that can be recorded by DSRC devices (2g), lateral acceleration data were not used
in the analysis.
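A check of this kind amounts to counting how many readings sit at the recording limit. The sketch below is illustrative only: it assumes accelerations expressed in units of g, treats readings at either the positive or the negative limit as saturated, and uses a hypothetical function name and tolerance.

```python
def saturated_share(values_g, limit_g=2.0, tol=1e-6):
    """Fraction of readings whose magnitude equals the recording limit.

    values_g : lateral accelerations in units of g (assumed)
    limit_g  : maximum recordable magnitude, 2g according to the text
    """
    values_g = list(values_g)
    hits = sum(1 for v in values_g if abs(abs(v) - limit_g) <= tol)
    return hits / len(values_g)


# Rough consistency check of the reported share, assuming a total of
# 65 million BSMs (the text says "more than 65 million").
reported_share = 27_240_788 / 65_000_000
```

A large share at exactly the limit, as reported in the paper, points to a sensor or encoding ceiling rather than genuine maneuvers.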
(FIGURE 1 HERE)
RESULTS
Descriptive Statistics
Table 1 presents the descriptive statistics of key variables used in modeling. The mean, standard
deviation, minimum, and maximum values are given for each variable, which can help in
conceptualizing the distributions. Descriptive statistics are given for all the intersections (N=116)
as well as for signalized intersections (N=53) and un-signalized intersections (N=63) separately.
For all intersections, signalized intersections, and un-signalized intersections, the mean five-year
crash frequency is 7.56, 12.94, and 3.04, respectively. As expected, signalized intersections have a
significantly higher crash frequency (on average) than un-signalized intersections. This finding is
in agreement with Abdel-Aty and Keller (28), who found approximately 9.6 crashes per year at
signalized intersections as opposed to only 2 crashes per year at un-signalized intersections.
Several factors may contribute to the occurrence of crashes at signalized intersections, such
as conflicting movements as well as different intersection-specific design variables (28). This said,
investigating instantaneous driving actions at such locations, and higher volatility (if any), may
help us design appropriate proactive strategies for preventing an “accident waiting to happen”
(29). Regarding location-based volatility, the statistics suggest that signalized intersections on
average have higher variability in longitudinal accelerations/decelerations compared with
un-signalized intersections, and thus can be more volatile (this is the case for all coefficients of
variation ($C_v$s) except the one for acceleration above the mean speed). One reason for the higher
volatility of acceleration above the mean speed at un-signalized intersections as compared to
signalized intersections may be the uninterrupted traffic at un-signalized intersections.
To avoid omitted variable bias in modeling (30), data on other variables such as five-
year average AADT (major and minor approaches), speed limits (major and minor approaches), and
number of approaches were collected. Regarding the number of approaches, 40% of all
intersections, 62.2% of signalized intersections, and 22% of un-signalized intersections are four-
legged intersections (Table 1). In terms of exposure on major and minor roads, signalized
intersections have higher AADT (on average) than un-signalized intersections (22,747 vs. 19,171
for major roads and 9,994 vs. 8,893 for minor roads). Regarding the number of lanes, the numbers of
through and left-turn lanes at signalized intersections are considerably higher than at un-
signalized intersections.
(TABLE 1 HERE)
Modeling Results
To examine the correlations between crash frequency and location-based volatility (as
measured by the coefficients of variation, $C_v$), count data models are estimated given the count
nature of crash frequency. Separate count data regression models are estimated for all
intersections, signalized intersections, and un-signalized intersections. Specifically,
fixed-parameter Poisson regressions are estimated for total crash frequency as a function of
location-based volatility, major and minor road AADT, major and minor road speed limits, and total
number of through lanes. It should be noted that the descriptive statistics for crash frequencies in
Table 1 appear to indicate over-dispersion in the data, in which case the Negative Binomial model
should be preferred over the Poisson model (31). Thus, statistical tests are conducted to check for
the existence of over-dispersion (27). As explained in the methodology section, Lagrange Multiplier
tests were conducted for all three Poisson models. By using Equation 3, the Lagrange Multiplier (LM)
values were 0.05, 0.031, and 0.15 for all intersections, signalized intersections, and un-signalized
intersections, respectively. The LM values are much smaller than the critical chi-square value of
3.84 for one degree of freedom at the 95% confidence level. Thus, we fail to reject the null
hypothesis that Poisson regression is appropriate, and Poisson regressions are used (31).
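The decision rule can be reproduced for the reported LM values with a few lines. For a chi-square variable with one degree of freedom, the upper-tail probability equals `erfc(sqrt(x / 2))`, so no statistics library is needed. The dictionary keys and the function name are labels introduced here; the LM values and the 3.84 threshold are those quoted in the text.

```python
import math

CRITICAL_CHI2 = 3.84  # one degree of freedom, 95% confidence level


def chi2_upper_tail_1dof(x):
    """P(X > x) for a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


reported_lm = {"all": 0.05, "signalized": 0.031, "un-signalized": 0.15}

for group, lm in reported_lm.items():
    p_value = chi2_upper_tail_1dof(lm)
    decision = "retain Poisson" if lm < CRITICAL_CHI2 else "consider Negative Binomial"
    print(f"{group}: LM = {lm}, p = {p_value:.3f}, {decision}")
```

All three statistics fall far below the threshold, which matches the conclusion that the Poisson specification is retained.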
Due to the likely presence of unobserved heterogeneity in crash data (23), which may arise
from several unobserved factors, random-parameter Poisson models are also estimated.
Fixed-parameter models are estimated with standard maximum likelihood, whereas random-parameter
models are estimated through simulated maximum likelihood with 200 Halton draws used for
the random parameters (23). Regarding the functional form of the random parameters, normal,
log-normal, Weibull, uniform, and triangular distributions are tested; normally distributed
random parameters give the best fit and are reported in this study.
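The simulated maximum likelihood idea can be illustrated with a minimal sketch. It generates Halton draws, maps them to standard normal deviates, and averages the Poisson probability of one observation over those draws. The single covariate, the absence of an intercept, and all names and parameter values are simplifying assumptions for illustration; the models in this study were estimated with dedicated econometric software and include several covariates.

```python
import math
from statistics import NormalDist


def halton(index, base=2):
    """Return the index-th element (index >= 1) of the Halton sequence in a given base."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        value += fraction * (index % base)
        index //= base
        fraction /= base
    return value


def normal_halton_draws(n_draws, base=2):
    """Map Halton points in (0, 1) to standard normal draws via the inverse CDF."""
    inverse_cdf = NormalDist().inv_cdf
    return [inverse_cdf(halton(i, base)) for i in range(1, n_draws + 1)]


def poisson_pmf(count, rate):
    """Poisson probability of observing `count` events given mean `rate`."""
    return math.exp(-rate + count * math.log(rate) - math.lgamma(count + 1))


def simulated_loglik(count, x, beta_mean, beta_sd, draws):
    """Simulated log-likelihood of one observation with one normally distributed slope."""
    probabilities = [
        poisson_pmf(count, math.exp((beta_mean + beta_sd * nu) * x)) for nu in draws
    ]
    # Average the conditional probabilities over the draws, then take the log.
    return math.log(sum(probabilities) / len(probabilities))


draws = normal_halton_draws(200)  # 200 draws, as reported in the text
# Hypothetical observation and parameter values, for illustration only.
print(simulated_loglik(count=3, x=1.0, beta_mean=0.5, beta_sd=0.2, draws=draws))
```

In a full estimation, this quantity would be summed over all intersections and maximized over the means and standard deviations of the random parameters. Setting `beta_sd` to zero recovers the ordinary fixed-parameter Poisson log-likelihood.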
The results obtained from the fixed- and random-parameter Poisson models are presented in Table
2. Marginal effects, which give the change in crash frequency associated with a one-unit change in
an explanatory variable, are also provided for the random-parameter models. Compared to
fixed-parameter models, random-parameter models resulted in a better fit, as shown by the improved
log-likelihood at convergence and McFadden's ρ² (Table 2) (31). While this study does not focus on
methodological approaches for modeling intersection crash data, the predicted vs. actual values of
crashes are plotted (Figure 2) and indicate that the random-parameter models fit the data at hand
better.
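As a small worked example of the goodness-of-fit comparison, the sketch below computes McFadden's statistic from the log-likelihood at zero and at convergence, assuming the usual definition 1 - L(β)/L(0). The log-likelihood values are those reported in Table 2 for the fixed-parameter models; the function name is an illustrative assumption.

```python
def mcfadden_rho_squared(loglik_zero, loglik_convergence):
    """McFadden's statistic, assuming the usual definition 1 - L(beta) / L(0)."""
    return 1.0 - loglik_convergence / loglik_zero


# Fixed-parameter models, log-likelihoods taken from Table 2.
rho2_all = mcfadden_rho_squared(-578.31, -336.72)
rho2_unsignalized = mcfadden_rho_squared(-158.18, -138.26)
print(f"All intersections: {rho2_all:.3f}")
print(f"Un-signalized intersections: {rho2_unsignalized:.3f}")
```

Under this definition the fixed-parameter values come out close to the 0.417 and 0.125 listed in Table 2.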
Discussion
Turning to the fixed-parameter estimation results for all intersections (Table 2), the results
provide evidence that the coefficients of variation (CVs) of acceleration below mean speed,
deceleration below mean speed, and deceleration above mean speed are positively associated
(statistically significant at the 95% confidence level) with crash frequency. However, the CV of
acceleration above mean speed is negatively associated with crash frequency (at the 90% confidence
level). It can be concluded that, overall, volatility of deceleration is positively associated
with crash frequency regardless of speed range. However, when it comes to acceleration, volatility
at lower speeds is a more significant factor than volatility at higher speeds.
At signalized intersections, the association between the CVs of acceleration below mean speed,
acceleration above mean speed, and deceleration above mean speed and crash frequency is also
positive and statistically significant. The CV of deceleration below mean speed at signalized
intersections, however, is negatively correlated with crash frequency.
Referring to the marginal effects for the random-parameter models in Table 2, on average a
one-percent increase in the coefficient of variation of deceleration above mean speed is associated
with a 0.11 increase in crash frequency for all intersections and a 0.089 increase in crash
frequency for signalized intersections. These findings have implications for proactive
intersection-related safety strategies. In addition, it is interesting to note the considerably
higher marginal effects of the acceleration coefficients of variation for signalized intersections,
implying that higher variability in acceleration at signalized intersections may potentially result
in more crashes. Given that signalized intersections are typically observed to have more crashes
(28), proactive intersection-customized strategies can be designed. For instance, proactive
warnings and alerts can be generated about potential hazards at specific intersections and
transmitted to drivers via connected vehicle technologies such as road-side equipment. This can in
turn increase drivers’ situational and safety awareness, and help drivers adopt safer driving
behaviors.
Regarding un-signalized intersections, as shown in Table 2, the coefficients of variation of
acceleration below mean speed and deceleration below mean speed are statistically significant. We
found a negative association between the coefficient of variation of acceleration below mean speed
and crash frequency. This finding is seemingly counterintuitive and needs further investigation.
Possibly, because un-signalized intersections have uninterrupted traffic on the major approach
(78% of them are T-intersections), separating 3-leg and 4-leg intersections might clarify this
result in future studies. However, the finding that the coefficient of variation of deceleration
below mean speed of intersection is positively associated with crash frequency is intuitive, i.e.,
the larger the volatility/variation in decelerations at low speeds, the higher the crash frequency
at a particular intersection.
The estimation results quantify associations between major and minor road AADT and crash
frequency. Referring to marginal effects from the random-parameter models, a one-unit increase
in the natural log of major road AADT is associated with 2.69-, 6.57-, and 1.82-unit increases in
crash frequency for all intersections, signalized intersections, and un-signalized intersections,
respectively. Minor road AADT is statistically significant in the random-parameter model for
signalized intersections, but the relationship is not statistically significant for un-signalized
intersections (Table 2). Speed limit on major roads is negatively associated with crash frequency
for all intersections. These findings are consistent with past studies on this topic (1; 32).
Notably, the total number of through lanes is positively associated with crash frequency. From
Table 2, it can be observed that one added through lane is correlated with 0.547 more crashes.
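To make the link between coefficients and marginal effects concrete, the following sketch computes an average marginal effect for a Poisson model. It assumes the standard result that the derivative of the expected count exp(x'β) with respect to covariate k is β_k times exp(x'β), averaged over observations. The marginal effects in Table 2 come from random-parameter models estimated with econometric software and may be computed differently; the function name and inputs here are hypothetical.

```python
import math


def average_marginal_effect(beta_k, linear_predictors):
    """Average of beta_k * exp(x_i' beta) over observations in a Poisson model."""
    expected_counts = [math.exp(eta) for eta in linear_predictors]
    return beta_k * sum(expected_counts) / len(expected_counts)


# Hypothetical linear predictors x_i' beta for three intersections, chosen so that
# the expected crash counts are 2, 4, and 6.
etas = [math.log(2.0), math.log(4.0), math.log(6.0)]
ame = average_marginal_effect(0.5, etas)
print(f"Average marginal effect: {ame:.3f}")
```

Because the effect scales with the expected count, the same coefficient implies a larger marginal effect where predicted crash frequencies are higher, which is one reason marginal effects can differ across intersection types even when coefficients are similar.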
Figure 3 illustrates how the study results can assist in proactive intersection safety
management. The black, green, and red circles in the figure are scaled crashes, volatility of
acceleration, and volatility of deceleration at lower speeds, respectively. The intersection in the
center is a known hotspot because it has more crashes and proportionately high levels of volatility.
However, two other intersections shown in dashed ellipses have relatively few crashes but high
volatility levels. At such locations (potential hotspots), although crash frequencies are low,
drivers show proportionately more volatile driving behavior. In other words, at such locations
crashes may be waiting to happen. Proactive countermeasures can be taken at those locations
depending on the real cause of driving volatility, e.g., by studying speed limits, signal timing,
geometric design, dilemma zones, and lines of sight.
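The screening logic behind Figure 3 can be sketched as a simple two-way classification. The median cut-offs, labels, field names, and toy records below are illustrative assumptions; the paper does not prescribe specific thresholds.

```python
from statistics import median


def classify_intersections(records, crash_cut=None, volatility_cut=None):
    """Label each intersection by crash frequency and volatility (cut-offs are assumptions)."""
    if crash_cut is None:
        crash_cut = median(r["crashes"] for r in records)
    if volatility_cut is None:
        volatility_cut = median(r["volatility"] for r in records)
    labels = {}
    for r in records:
        high_crash = r["crashes"] > crash_cut
        high_volatility = r["volatility"] > volatility_cut
        if high_crash and high_volatility:
            labels[r["id"]] = "known hotspot"
        elif high_volatility:
            labels[r["id"]] = "crashes may be waiting to happen"
        elif high_crash:
            labels[r["id"]] = "high crashes, low volatility"
        else:
            labels[r["id"]] = "low priority"
    return labels


# Hypothetical intersections, not taken from the study data.
toy = [
    {"id": "A", "crashes": 30, "volatility": 200.0},
    {"id": "B", "crashes": 2, "volatility": 190.0},
    {"id": "C", "crashes": 3, "volatility": 90.0},
    {"id": "D", "crashes": 12, "volatility": 100.0},
]
labels = classify_intersections(toy)
for name, label in labels.items():
    print(name, "->", label)
```

Intersections in the second category are the candidates for proactive countermeasures, since their crash history alone would not flag them.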
(FIGURE 2 HERE)
(FIGURE 3 HERE)
LIMITATIONS
The study captures variability in longitudinal acceleration/deceleration as a measure of
intersection-specific volatility, which only partially captures the true volatility exhibited by
drivers. As explained in the methodology section, due to data limitations, the study could not
incorporate lateral acceleration/deceleration in the estimation of intersection-specific
volatility. While the results from this study provide evidence of an association between crash
frequency and intersection-specific volatility, more robust measures such as vehicular jerk and a
combination of longitudinal and lateral accelerations can be used in future studies for quantifying
volatility at specific intersections. Also, the results and conclusions of this study depend on
the sample size. Another limitation is that one month of data was used to explain 5-year average
crash frequency. While the current sample size may not be enough to draw robust conclusions, the
authors have used all available data for the 116 intersections.
CONCLUSIONS
This study contributes by developing and demonstrating a proactive intersection safety
methodology using real-world large-scale connected vehicle data. The study quantifies volatility
in instantaneous driving decisions using intersection-specific Basic Safety Messages (BSMs) and
its relationship with observed crash frequencies, while controlling for other variables. Such a
method can complement the state-of-the-art in evaluating intersection safety, which is largely
reactive, based on observed and expected crash frequencies. Emerging data from Connected
and Automated Vehicles (CAVs) are increasingly becoming available and can help us understand the
detailed nature of instantaneous driving behaviors prior to the occurrence of unsafe outcomes such
as crashes/incidents. This study proposes the concept of location-based volatility that captures the
extent of variations in instantaneous driving decisions.
A unique database that provides a more complete picture of operations and safety
performance was created by combining more than 65 million Basic Safety Messages transmitted
between connected vehicles and roadside units at 116 intersections in Ann Arbor, Michigan, with
crash and inventory data. The geo-coded raw BSMs were allocated to each intersection and the
connected vehicles trajectories extracted from raw BSMs were plotted, revealing reasonable data
precision and coverage. A simple and standardized measure of dispersion, the Coefficient of
Variation (CV), also known as the relative standard deviation, was used to quantify the
fluctuations in longitudinal accelerations and/or decelerations at specific intersections. Five-year
crash frequencies, AADT, speed limits, and number of approaches for all intersections are
extracted and linked with location-based volatilities. Significant efforts went into data processing,
collection, and linkage.
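A minimal sketch of the location-based volatility calculation is given below. It splits the records for one intersection into four groups (acceleration or deceleration, below or above the intersection's mean speed) and computes a coefficient of variation for each. Expressing the result in percent, using the population standard deviation, taking the absolute value of the mean, and the record layout are all assumptions made for illustration rather than details confirmed by the paper.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV in percent: population standard deviation divided by |mean|, times 100."""
    center = mean(values)
    if center == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * pstdev(values) / abs(center)


def location_based_volatility(records):
    """Compute four CVs from (speed, longitudinal acceleration) pairs at one intersection."""
    records = list(records)
    mean_speed = mean(speed for speed, _ in records)
    groups = {"acc_below": [], "acc_above": [], "dec_below": [], "dec_above": []}
    for speed, accel in records:
        if accel == 0:
            continue  # neither accelerating nor decelerating
        kind = "acc" if accel > 0 else "dec"
        band = "below" if speed < mean_speed else "above"
        groups[f"{kind}_{band}"].append(accel)
    # Only report groups with at least two observations.
    return {name: coefficient_of_variation(vals)
            for name, vals in groups.items() if len(vals) >= 2}


# Hypothetical BSM-like records: (speed, longitudinal acceleration).
toy_records = [
    (5, 1.0), (5, 3.0), (15, 2.0), (15, 2.0),
    (5, -1.0), (5, -3.0), (15, -2.0), (15, -4.0),
]
lbv = location_based_volatility(toy_records)
for name, value in lbv.items():
    print(f"{name}: {value:.2f}")
```

Each intersection then contributes four volatility values that can be linked to its crash, AADT, and speed limit records.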
Rigorous fixed and random parameter Poisson regression models are estimated that allow
consideration of unobserved heterogeneity in crash data. The modeling results reveal that most of
the computed coefficients of variation (as measures of volatility) are positively associated with
crash frequency. The
study has implications for proactive intersection safety management. Importantly, the magnitude
of association between location-based volatility and crash frequency is significantly higher for
signalized intersections, implying that higher variability in instantaneous driving decisions at
signalized intersections may potentially result in more crashes. This finding is important in the
sense that if many drivers behave in a volatile manner at a specific intersection (exhibit higher
variability in longitudinal accelerations), then such intersections can be identified before accidents
happen. Of course, the reasons for volatile behaviors may be related to intersection and
environmental conditions, as well as vehicle and driver conditions. Given that signalized
intersections are typically observed to have more crashes (28), intersection-customized strategies
can be designed to improve safety. Proactive warnings and alerts can be generated about potential
hazards at specific intersections and transmitted to drivers via connected vehicle technologies
such as road-side equipment; these can in turn increase drivers’ situational and safety awareness,
and help them pursue safer driving at dangerous intersections.
ACKNOWLEDGEMENT
This paper is based upon work supported by the US National Science Foundation under grant
No. 1538139. Additional support was provided by the US Department of Transportation through
the Collaborative Sciences Center for Road Safety, a consortium led by The University of North
Carolina at Chapel Hill in partnership with The University of Tennessee. Any opinions, findings,
and conclusions or recommendations expressed in this paper are those of the authors and do not
necessarily reflect the views of the sponsors.
REFERENCES
[1] Abdel-Aty, M., and K. Haleem. Analyzing angle crashes at unsignalized intersections using machine
learning techniques. Accident Analysis & Prevention, Vol. 43, No. 1, 2011, pp. 461-470.
[2] Persaud, B., and T. Nguyen. Disaggregate safety performance models for signalized intersections on
Ontario provincial roads. Transportation Research Record: Journal of the Transportation Research
Board, No. 1635, 1998, pp. 113-120.
[3] Kamrani, M., S. M. H. E. Abadi, and S. R. Golroudbary. Traffic simulation of two adjacent
unsignalized T-junctions during rush hours using Arena software. Simulation Modelling Practice and
Theory, Vol. 49, 2014, pp. 167-179. https://doi.org/10.1016/j.simpat.2014.09.006.
[4] Khattak, A., S. Nambisan, and S. Chakraborty. Study of Driving Volatility in Connected and
Cooperative Vehicle Systems. National Science Foundation, 2015.
[5] Wang, X., A. J. Khattak, J. Liu, G. Masghati-Amoli, and S. Son. What is the level of volatility in
instantaneous driving decisions? Transportation Research Part C: Emerging Technologies, Vol. 58, 2015,
pp. 413-427.
[6] Liu, J., and A. J. Khattak. Delivering improved alerts, warnings, and control assistance using basic
safety messages transmitted between connected vehicles. Transportation Research Part C: Emerging
Technologies, Vol. 68, 2016, pp. 83-100.
[7] Henclewood, D. Safety Pilot Model Deployment One Day Sample Data Environment Data
Handbook, 2014.
[8] Osman, O. A., and S. Ishak. A network level connectivity robustness measure for connected vehicle
environments. Transportation Research Part C: Emerging Technologies, Vol. 53, 2015, pp. 48-58.
[9] Ghiasi, A., J. Ma, F. Zhou, and X. Li. Speed Harmonization Algorithm Using Connected Autonomous
Vehicles. Presented at Transportation Research Board, 2017.
[10] Li, X., A. Ghiasi, and Z. Xu. Exact Method for a Simplified Trajectory Smoothing Problem with
Connected Automated Vehicles. Presented at Transportation Research Board, 2017.
[11] Bergenhem, C., S. Shladover, E. Coelingh, C. Englund, and S. Tsugawa. Overview of platooning
systems. In Proceedings of the 19th ITS World Congress, Oct 22-26, Vienna, Austria, 2012.
[12] NHTSA. Resource Guide Describes Best Practices For Aggressive Driving Enforcement. National
Highway Traffic Safety Administration, U.S. Department of Transportation, Washington, DC.
http://www.nhtsa.gov/About+NHTSA/Traffic+Techs/current/Resource+Guide+Describes+Best+Practices+For+Aggressive+Driving+Enforcement.
Accessed June 22nd, 2016.
[13] Haglund, M., and L. Åberg. Speed choice in relation to speed limit and influences from other drivers.
Transportation Research Part F: Traffic Psychology and Behaviour, Vol. 3, No. 1, 2000, pp. 39-51.
[14] Wali, B., A. Ahmed, S. Iqbal, and A. Hussain. Effectiveness of enforcement level of speed limit and
drink driving laws and associated factors: Exploratory empirical analysis using a bivariate ordered probit
model. Journal of Traffic and Transportation Engineering (English Edition), 2017 (forthcoming).
[15] Paleti, R., N. Eluru, and C. R. Bhat. Examining the influence of aggressive driving behavior on
driver injury severity in traffic crashes. Accident Analysis & Prevention, Vol. 42, No. 6, 2010, pp. 1839-
1854.
[16] Chrysler, S. T., J. M. Cooper, and D. Marshall. The Cost of Warning of Unseen Threats: Unintended
Consequences of Connected Vehicle Alerts. In Transportation Research Board 94th Annual Meeting,
2015.
[17] Genders, W., and S. N. Razavi. Impact of connected vehicle on work zone network safety through
dynamic route guidance. Journal of Computing in Civil Engineering, Vol. 30, No. 2, 2015, p. 04015020.
[18] Du, L., and H. Dao. Information Dissemination Delay in Vehicle-to-Vehicle Communication
Networks in a Traffic Stream. Intelligent Transportation Systems, IEEE Transactions on, Vol. 16, No. 1,
2015, pp. 66-80.
[19] Wang, X., A. Khattak, J. Liu, G. Masghati-Amoli, and S. Son. What is the Level of Volatility in
Instantaneous Driving Decisions? Transportation Research Part C: Emerging Technologies, 2015.
10.1016/j.trc.2014.12.014.
[20] Liu, J., X. Wang, and A. Khattak. Generating Real-Time Driving Volatility Information. Presented at
2014 World Congress on Intelligent Transport Systems, Detroit, MI, 2014.
[21] Liu, J., and A. Khattak. Delivering Improved Alerts, Warnings, and Control Assistance Using Basic
Safety Messages Transmitted between Connected Vehicles. Transportation Research Part C: Emerging
Technologies, 2016.
[22] Lord, D., and F. Mannering. The statistical analysis of crash-frequency data: a review and assessment
of methodological alternatives. Transportation Research Part A: Policy and Practice, Vol. 44, No. 5,
2010, pp. 291-305.
[23] Anastasopoulos, P. C., and F. L. Mannering. A note on modeling vehicle accident frequencies with
random-parameters count models. Accident Analysis & Prevention, Vol. 41, No. 1, 2009, pp. 153-159.
[24] Li, X., A. J. Khattak, and B. Wali. Large-Scale Traffic Incident Duration Analysis: The Role of
Multi-agency Response and On-Scene Times. Transportation Research Record: Journal of the
Transportation Research Board, 2017 (forthcoming).
[25] Khattak, A. J., J. Liu, B. Wali, X. Li, and M. Ng. Modeling Traffic Incident Duration Using Quantile
Regression. Transportation Research Record: Journal of the Transportation Research Board, No. 2554,
2016, pp. 139-148.
[26] Henclewood, D. Safety Pilot Model Deployment One Day Sample Data Environment Data
Handbook. Research and Technology Innovation Administration, US Department of Transportation,
McLean, VA, 2014.
[27] Greene, W. H. Econometric analysis. Pearson Education India, 2003.
[28] Abdel-Aty, M., and J. Keller. Exploring the overall and specific crash severity levels at signalized
intersections. Accident Analysis & Prevention, Vol. 37, No. 3, 2005, pp. 417-425.
[29] Schneider, R. J., R. M. Ryznar, and A. J. Khattak. An accident waiting to happen: a spatial approach
to proactive pedestrian planning. Accident Analysis & Prevention, Vol. 36, No. 2, 2004, pp. 193-211.
[30] Mannering, F. L., and C. R. Bhat. Analytic methods in accident research: methodological frontier and
future directions. Analytic Methods in Accident Research, Vol. 1, 2014, pp. 1-22.
[31] Washington, S. P., M. G. Karlaftis, and F. Mannering. Statistical and econometric methods for
transportation data analysis. CRC press, 2010.
[32] Ye, X., R. M. Pendyala, S. P. Washington, K. Konduri, and J. Oh. A simultaneous equations model of
crash frequency by collision type for rural intersections. Safety Science, Vol. 47, No. 3, 2009, pp. 443-
452.
 
LIST OF TABLES
TABLE 1: Description of Key Variables and Descriptive Statistics
TABLE 2: Modeling Results of Fixed- and Random-Parameter Poisson Regressions
LIST OF FIGURES
FIGURE 1: a) Four quadrants used to calculate coefficients of variation (standard deviation
divided by mean) for each intersection, b) Plot of used data (left)/ Histogram of lateral
acceleration (right)
FIGURE 2: Mean-expected over actual number of crashes for fixed and random-parameter
Poisson models (Green: fixed parameter models; Red: random parameter models)
FIGURE 3: Known hotspots and spots where crashes are waiting to happen.
TABLE 1 Description of Key Variables and Descriptive Statistics

                                       All Intersections (N = 116)          Signalized (N = 53)       Un-signalized (N = 63)
Variable                                 Mean      SD      Min/Max     Mean      SD      Min/Max     Mean      SD      Min/Max
Crash frequency (5 years)                7.56    7.64         0/44    12.94    8.03         1/44     3.04    2.95         0/14
[label missing]                          4.28    4.56         0/24     7.07    5.24         1/24     1.93    1.79          0/9
CV, acceleration below mean speed      143.71   56.03       69/239   182.44   57.58       83/329   111.13   26.12       69/191
CV, acceleration above mean speed        84.9   13.76       56/121    77.93    12.7       59/113    90.77    11.8       57/121
CV, deceleration below mean speed      137.51      43       71/287   168.67   41.15       87/287   111.29   21.94       71/181
CV, deceleration above mean speed       96.29    12.9       57/155    99.44   14.86       76/155    93.64   10.39       57/115
Major road AADT                         20805    8326   3100/45400    22747    8209   3600/45400    19171    8131   3100/38900
Minor road AADT                          9396    4138   1100/27400     9994    5706   3100/27400     8893    1972   1100/13400
Ln (Major Road AADT)                     9.84    0.49   8.03/10.72     9.96    0.39   8.18/10.72     9.74    0.54   8.03/10.56
Ln (Minor Road AADT)                     9.05    0.47      7/10.21     9.07    0.52   8.03/10.21     9.03    0.42       7/9.50
Speed limit major road                  35.34    7.24        25/45    35.94    7.34        25/45    34.84    7.18        25/45
Speed limit minor road                  30.47    3.95        25/45    30.84    5.16        25/45    30.15    2.53        25/40
[label missing]                           0.4    0.49          0/1    0.622   0.489          0/1     0.22    0.41          0/1
Total through lanes                      4.45    1.28          2/8     5.13    1.35          2/8     3.38     0.9          2/6
Left-turn lanes                          1.53    1.32          0/6     2.26     1.4          0/6     0.92    0.88          0/3
[label missing]                          0.93    0.78          0/4     1.11    1.01          0/4     0.79    0.48          0/2
Notes: CV: coefficient of variation, computed separately for acceleration below mean speed of
intersection, acceleration above mean speed of intersection, deceleration below mean speed of
intersection, and deceleration above mean speed of intersection; AADT: Annual Average Daily
Traffic; SD is standard deviation; Min is minimum value; Max is maximum value.
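TABLE 2 Modeling Results of Fixed- and Random-Parameter Poisson Regressions

Signalized and Un-signalized Intersections (N = 116)
                                                  Fixed Par.                   Random Par.
Variables                                        β    t-stat         β    t-stat        ME
Constant                                    -7.752      -6.6    -7.786    -7.237       ---
  Standard deviation*                          ---       ---       ---       ---       ---
CV, acceleration below mean speed            0.006     4.152     0.004     2.902     0.025
  Standard deviation*                          ---       ---       ---       ---       ---
CV, acceleration above mean speed           -0.003    -0.776    -0.007    -1.983    -0.038
  Standard deviation*                          ---       ---     0.005    11.856       ---
CV, deceleration below mean speed            0.002     1.243     0.005     2.827     0.027
  Standard deviation*                          ---       ---       ---       ---       ---
CV, deceleration above mean speed             0.02     6.449     0.021      6.33      0.11
  Standard deviation*                          ---       ---    0.0007     2.182       ---
Ln (Major Road AADT)                         0.547     4.899     0.527     5.322     2.694
  Standard deviation*                          ---       ---     0.011     3.376       ---
Ln (Minor Road AADT)                         0.123     1.656      0.15      1.97     0.767
  Standard deviation*                          ---       ---     0.006     2.152       ---
Speed limit major road                      -0.009    -1.736    -0.014    -2.497    -0.073
Speed limit minor road                         ---       ---       ---       ---       ---
Total through lanes                           0.61     1.733     0.107     3.223     0.547
Summary Statistics
Log-lik. at Zero L(0)                      -578.31             -578.31
Log-lik. at Convergence L(β)               -336.72             -305.02
McFadden ρ²                                  0.417               0.831
Sample Size (N)                                116

Signalized Intersections (N = 53)
                                                  Fixed Par.                   Random Par.
Variables                                        β    t-stat         β    t-stat        ME
Constant                                     -7.21    -4.975     -7.35    -6.958       ---
  Standard deviation*                          ---       ---       ---       ---       ---
CV, acceleration below mean speed            0.009     3.434      0.01     5.346     0.125
  Standard deviation*                          ---       ---    0.0002     1.991       ---
CV, acceleration above mean speed            0.009     1.453      0.01     1.959     0.118
  Standard deviation*                          ---       ---       ---       ---       ---
CV, deceleration below mean speed           -0.003    -1.541    -0.004    -2.222    -0.057
  Standard deviation*                          ---       ---    0.0009     4.363       ---
CV, deceleration above mean speed            0.008     1.872     0.007     1.981     0.089
  Standard deviation*                          ---       ---       ---       ---       ---
Ln (Major Road AADT)                          0.55     3.716     0.565     5.561     6.575
  Standard deviation*                          ---       ---       ---       ---       ---
Ln (Minor Road AADT)                         0.191     2.083     0.207      2.03     2.413
  Standard deviation*                          ---       ---       ---       ---       ---
Speed limit major road                       0.004     0.576     0.008     1.227     0.097
Speed limit minor road                      -0.016    -1.444    -0.023     -1.62    -0.271
Total through lanes                            ---       ---       ---       ---       ---
Summary Statistics
Log-lik. at Zero L(0)                      -226.73             -226.73
Log-lik. at Convergence L(β)               -159.43             -154.91
McFadden ρ²                                   0.31               0.893
Sample Size (N)                                 53

Un-signalized Intersections (N = 63)
                                                  Fixed Par.                   Random Par.
Variables                                        β    t-stat         β    t-stat        ME
Constant                                       -10    -3.574     -9.61    -3.235       ---
  Standard deviation*                          ---       ---     0.488     6.155       ---
CV, acceleration below mean speed           -0.014    -2.831    -0.016    -2.911    -0.035
  Standard deviation*                          ---       ---       ---       ---       ---
CV, acceleration above mean speed            0.005     0.683     0.004      1.28      0.01
  Standard deviation*                          ---       ---       ---       ---       ---
CV, deceleration below mean speed            0.015     2.698    0.0153     3.186     0.036
  Standard deviation*                          ---       ---       ---       ---       ---
CV, deceleration above mean speed          -0.0007     -0.09    0.0001      0.05    0.0004
  Standard deviation*                          ---       ---       ---       ---       ---
Ln (Major Road AADT)                         0.866     4.801     0.757     4.106     1.823
  Standard deviation*                          ---       ---     0.488     6.155       ---
Ln (Minor Road AADT)                         0.231     1.004     0.292      1.25     0.704
  Standard deviation*                          ---       ---       ---       ---       ---
Speed limit major road                         ---       ---       ---       ---       ---
Speed limit minor road                         ---       ---       ---       ---       ---
Total through lanes                            ---       ---       ---       ---       ---
Summary Statistics
Log-lik. at Zero L(0)                      -158.18             -158.18
Log-lik. at Convergence L(β)               -138.26             -130.44
McFadden ρ²                                  0.125                0.59
Sample Size (N)                                 63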
Notes: ME: Average Marginal Effects from Random Parameter Model. CV: coefficient of variation,
computed separately for acceleration below mean speed of intersection, acceleration above mean
speed of intersection, deceleration below mean speed of intersection, and deceleration above mean
speed of intersection; AADT: Annual Average Daily Traffic; *Standard deviation of normally
distributed random parameters.
FIGURE 1: a) Four quadrants used to calculate coefficients of variation (standard deviation
divided by mean) for each intersection, b) Plot of used data (left)/ Histogram of lateral
acceleration (right)
FIGURE 2: Mean-expected over actual number of crashes for fixed and random-parameter
Poisson models (Green: fixed parameter models; Red: random parameter models)
FIGURE 3: Known hotspots and spots where crashes are waiting to happen.
... (2) Improve trafc efciency. CAV can improve trafc efciency because of the advanced sensing and communication equipment in the vehicle networking system [4]. Lee et al. used VISSIM simulation to obtain a result that the network total network time is saved by 16% and the network average speed is improved by 15.7%, under the condition of 30% CAV penetration [5]. ...
... through the coordinated deployment of CACC and road variable speed limit strategies [6]. (4) Reduce energy consumption. Gawron et al. found that the functions in the CAV-enabled smart transportation system, such as energy-saving driving, feet driving, and smart intersection, can save about 9% energy consumption [7]. ...
... We set the parameters of the ACO algorithm in Table 4. As to the parameters m, α, β, ρ, and Q, the reference ranges are [1,10], [1,4], [0, 5], [0.1, 0.5], and [100, 500], respectively. Table 4 shows the relative feasible input of the parameters after several tests. ...
Article
Full-text available
If dedicate a lane to connected autonomous vehicle (CAV) on a multilane road, the traffic congestion and safety risks remain a major problem but in a different style. Random and disorderly mandatory lane-changing behaviour before approaching the next ramp or intersection would have a disturbing effect on the following vehicles of the traffic flow. This paper mainly establishes the optimal mandatory lane-changing location matching model for each target vehicle in the dedicated CAV lane environment. The aim is to minimizing the total travel time, which could take the disturbing effect into account. This model nests the cell transmission model (CTM) to describe vehicle running. The constraints include the relation between target CAV lane-changing cell and the corresponding behaviour start time, the updating of the flow, and occupancy for varied cells. We use the Ant Colony Optimization (ACO) algorithm to solve the problem. Through the case study of a basic two-lane road scenario in Ningbo, we acquire the convergence results based on the ACO algorithm. Our optimal lane-changing location matching scheme can save 5.9% total travel time when compared to the near-end location lane-changing scheme. We test our model by increasing the total number of upstream input vehicles with 4%, 11%, 15%, and the mandatory lane-changing vehicles with 60%, 200%, respectively. The testing results prove that out optimization method could deal with varied road traffic flow situations. Specifically, when the traffics and mandatory lane-changing vehicles increase, our method could perform better.
... The traditional approach to safety analysis has relied historically on physical infrastructure, crash data, manual data collection, and usually inferential statistical modeling to evaluate the safety of the road networks [3,4]. Although this approach has seen a lot of success over the years, there is still much to be desired regarding more proactive approaches to traffic safety [4,[7][8][9]. In recent years, studies have demonstrated that the advancement of commercially available and inexpensive real-time disaggregated vehicle data has the potential to be utilized to develop real-time crash prediction models [3,4,10], yet what is missing is a referenceable data processing and modelling framework of how the currently available data, and the potential variables that can be extracted from them in their current form, can be used to develop deployable detection and prediction models of safety critical situations, which this study seeks to provide. ...
... With regard to traffic safety analysis and the implementation of connected vehicle data, studies have looked at developing surrogate safety metrics from the collective analysis of different parameters extracted from connected vehicle data, such as hard accelerations and hard decelerations as well as vehicular jerk [8]. These metrics have come to be collectively known as driving volatility and have proven to be an especially useful proxy for crash risk situations. ...
... The roadway segment links are then ordered and indexed by virtue of their connectedness and direction of traffic flow. A clear spatial and temporal relationship can be observed between these two variables ( Figure 4) strengthening the prior hypothesis that hard decelerations can serve as a suitable surrogate for crashes, if necessary, as is consistent with the literature [8,34,38]. This relationship is further explored in another study that utilized an entropy based localized bivariate analysis to define the spatial and temporal relationship between hard decelerations and crash hotspots, concluding on an observed positive linear relationship in 63.21 percent of the coverage area of the study region, as well as a concave relationship in 20.37 percent and convex relationship in 14.23 percent of the study region [39]. ...
Article
Full-text available
Assessment of roadway safety in real-time is a necessary component for providing proactive safety countermeasures to ensure the continued safety and efficiency of roadways. A framework for utilizing data from connected vehicles and other probe sources is proposed in this study. Connected vehicles present an opportunity to provide live fingerprinting and activity monitoring on roadways. Taking advantage of high-resolution trajectory data streaming directly from connected vehicles, variables are extracted and the relationship with crashes are explored utilizing statistical and machine learning models. Hard acceleration events, in conjunction with segment miles are shown to have strong positive correlations with historical crash outcomes as proven by OLS, Poisson and Gradient Booster regression models. An XGBoost classification model is then trained to predict the real-time instances of crash outcomes at 5 min temporal bins with high levels of accuracy when trained with data including the real-time segment speed, reference speed, segment miles, a segment crash risk factor and other variables related to the difference in speeds between consecutive segments as well as the hour of the day. A weighted ensemble model achieved the best performance with an accuracy of 0.95. The results present evidence that the framework can capitalize on the richness of data available via connected vehicles and is implementable as a component in Advanced Traffic Management Systems for the analysis of safety critical situations in real-time.
... Nevertheless, some studies have identified risky or aggressive driving behaviors based on the magnitude of acceleration, braking, and steering employed by the driver (8)(9)(10)(11)(12). These risky driving behaviors, in turn, have been found to be positively correlated with the likelihood of a crash or near-crash (13)(14)(15)(16)(17)(18). Braking behavior has also been found to be significantly associated with traffic safety. ...
Article
Police-reported crash data have been the de facto element used by the transportation agencies in developing and implementing traffic safety projects. This approach is reactive in nature and can lead to suboptimal investment decisions owing to inherent challenges in crash data analysis. Because of their large-scale and near real-time availability, connected vehicle (CV) driving event data have emerged as a promising means of addressing these challenges. This study utilized CV event data for three different event types, namely, acceleration, braking, and cornering at three severity levels (easy, normal, and harsh), to examine the viability of using these data in traffic safety analysis. The results showed a strong correlation between crash frequency and CV driving event frequency. CV event data also improved the goodness-of-fit of crash frequency models. The results also showed that the relationship between CV driving events and traffic volume and roadway geometric data were generally consistent with the trends that crash data usually exhibit with the same predictors. This was true at both segment level and individual event level, as well as when the data were examined across different event/crash types. Overall, the results showed a strong case for these data to be used in traffic safety analyses as a complement to, or in lieu of, crash data.
... Previous studies (Arnaout and Bowling, 2014) have shown that CAVs have a smaller safe headway than HDVs, which can significantly improve road capacity. In recent years, with the development of wireless communication technology, the communication capability of CAVs has been further enhanced, which leads CAVs can communicate with each other (Levin and Boyles, 2016;Kamrani et al., 2017). As a result, CAVs had been rapidly developed and applied since vehicles were equipped with sensory and communication systems. ...
Article
Full-text available
Traffic flow will be mixed with connected automated vehicles (CAVs) and human-driven vehicles (HDVs) in the future. The randomness of the spatial distribution of different types of vehicles (i.e., CAVs and HDVs) will not be conducive to the stability and safety of traffic flow, leading to the deterioration of traffic capacity. Therefore, reasonable organization and management of the spatial distribution of vehicles in mixed traffic flow are significant for improving the performance of transportation systems. To effectively organize CAVs and realize the management of automated dedicated lanes, this paper proposes a mixed capacity and lane management model considering platoon size and intensity of CAVs. Firstly, the spatial distribution of different headway types is calculated based on a Markov chain model. Secondly, a single-lane capacity model is developed based on the headway distribution. Then, we analyze the sensibility of the model's parameters, including market penetration rates, platooning intensity, and platoon size of CAVs. Finally, we investigate the relationship between traffic capacity and lane management. Numerical analyses illustrate that the single-lane capacity is improved by increasing the market penetration rate, platoon size, and platooning intensity of CAVs. Moreover, The insight of the lane management model indicates that optimal lane management is associated with the market penetration rate of CAVs. These findings provide a strategy for the operation and management of dedicated lanes of CAVs in the future.
... Past research has developed and used several driving volatility measures for speed, lateral acceleration, longitudinal acceleration, and vehicular jerk (Arvin et al., 2019a; Kamrani et al., 2017, 2018; Wali et al., 2018a; Miaou et al., 2005) to assess variation in driving movement. Likewise, this study applies volatility functions to velocity, lateral acceleration, and longitudinal acceleration. ...
Article
About 40 percent of motor vehicle crashes in the US are related to intersections. To deal with such crashes, Safety Performance Functions (SPFs) are vital elements of the predictive methods used in the Highway Safety Manual. The predictions of crash frequencies and potential reductions due to countermeasures are based on exposure and geometric variables. However, the role of driving behavior factors, e.g., hard accelerations and decelerations at intersections, which can lead to crashes, is not explicitly treated in SPFs. One way to capture driving behavior is to harness connected vehicle data and quantify performance at intersections in terms of driving volatility measures, i.e., rapid changes in speed and acceleration. According to recent studies, driving volatility is typically associated with higher risk and safety-critical events and can serve as a surrogate for driving behavior. This study incorporates driving volatility measures in the development of SPFs for four-leg signalized intersections. The Safety Pilot Model Deployment (SPMD) data containing over 125 million Basic Safety Messages generated by over 2,800 connected vehicles are harnessed and linked with the crash, traffic, and geometric data belonging to 102 signalized intersections in Ann Arbor, Michigan. The results show that including driving volatility measures in SPFs can reduce model bias and significantly enhance the models' goodness-of-fit and predictive performance. Technically, the best results were obtained by applying Bayesian hierarchical Negative Binomial Models, which account for spatial correlation between signalized intersections. The results of this study have implications for practitioners and transportation agencies about incorporating driving behavior factors in the development of SPFs for greater accuracy and measures that can potentially reduce volatile driving.
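One simple volatility measure is the coefficient of variation, i.e., the standard deviation divided by the mean, computed per intersection. The sketch below is only illustrative: the function name, the intersection labels, and the speed samples are hypothetical, and the choice of the population standard deviation is an assumption rather than something the study specifies.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return 100 * std / mean, a unit-free measure of relative dispersion."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    # Assumption: population standard deviation (pstdev), not the sample version.
    return 100.0 * pstdev(values) / m


# Hypothetical speed samples (m/s) observed at two intersections.
speeds_by_intersection = {
    "A": [10.0, 10.0, 10.0, 10.0],  # steady driving
    "B": [5.0, 15.0, 5.0, 15.0],    # same mean speed, much more variation
}

volatility = {
    name: coefficient_of_variation(samples)
    for name, samples in speeds_by_intersection.items()
}
```

Both intersections have the same mean speed, so only the dispersion separates them: intersection A scores 0 and intersection B scores 50.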
... The number of vehicle interaction data according to traffic flow characteristics is 1,485,918 in high-density conditions and 675,357 in low-density conditions. The crash risk data were divided into input and output data generated as continuous time series data by the TSPs as shown in (2). TSPs affect the prediction accuracy when a short-term prediction model is constructed using time series traffic data. ...
Article
The availability of vehicle interaction data, which is obtained by an in-vehicle forward collision warning system, including spacing between the leading and the following vehicle and time-to-collision, provides a valuable opportunity to predict crash risks in real time. When this opportunity is combined with connected vehicle technologies including vehicle-to-vehicle wireless communications, it is expected that more effective crash prevention would be achievable by providing predictive warning information as a part of proactive traffic safety management (PTSM). The purpose of this study is to develop a more reliable in-vehicle warning information provision strategy based on the prediction of crash risks using vehicle interaction data. A crash risk prediction model based on a long short-term memory (LSTM) network was able to predict the crash risk 3 seconds ahead with a mean absolute percentage error of 8% using the data for the past 5 seconds. The predicted crash risk data were applied to derive the optimal threshold for triggering in-vehicle warning information, which is the essence of the proposed warning provision strategy. This study defined three indicators to evaluate the reliability of warning information: correct detection rate (CDR), detection failure rate (DFR), and information provision rate (IPR). An exemplar analysis result showed that the optimal threshold to minimize IPR in a situation where CDR and DFR are 100% and 0%, respectively, was identified as 0.69. The proposed methodology that predicts crash risks in real time and provides V2V-based warning information in a more proactive manner is expected to mitigate the crash risk significantly.
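Time-to-collision, one of the vehicle interaction variables mentioned above, is conventionally the spacing divided by the closing speed. The sketch below shows only that standard definition; the function name and the input values are hypothetical, and it does not reproduce the study's crash risk model or its 0.69 warning threshold.

```python
import math


def time_to_collision(spacing_m, follower_speed_mps, leader_speed_mps):
    """Seconds until the follower reaches the leader if both keep their speeds.

    Returns math.inf when the follower is not closing in on the leader.
    """
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0:
        return math.inf  # gap is constant or growing, so no collision course
    return spacing_m / closing_speed


# Hypothetical example: 30 m gap, follower at 20 m/s, leader at 15 m/s.
ttc_closing = time_to_collision(30.0, 20.0, 15.0)
ttc_opening = time_to_collision(30.0, 15.0, 20.0)
```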
Article
The coronavirus disease 2019 (COVID-19) pandemic led to a substantial reduction in activity-travel resulting from a shift in the primary purpose of travel from work/study to shopping and essential trips. Further, several researchers noted changes to roadway safety during the early phases of the pandemic, such as a decrease in traffic crashes but an increase in traffic fatalities with respect to vehicle miles traveled. However, the existing literature is limited to holistic trends, with little inference to microscopic driving styles. This research utilizes a paneled approach to compare the freeway behavior of select passenger car drivers with respect to changes in driving volatility stemming from the lockdown period and one year into the COVID-19 pandemic. The methodological design employs a rich paneled dataset obtained from the connected vehicles pilot study in Tampa, Florida. Crash data visualization is also performed to identify crash hotspots within the study area. Volatility measures such as the standard deviation, coefficient of variation, and time-varying stochastic volatility (TVSV) are then generated and fused with traffic and weather information. Hierarchical modeling is then performed using a panel approach to account for unobserved heterogeneity within multiple observations per individual driver. The results show that the series of jerk-related driving volatility measures and the TVSV of speed increase one year into the pandemic, suggesting a reduction in overall roadway safety along the study segment. Based on the observed individual volatility changes, policy recommendations such as enhanced traffic enforcement guidelines, data-driven safety campaigns, and systematic implementation of lockdown protocols were identified.
Article
The Tampa Hillsborough Expressway Authority Connected Vehicle Pilot Deployment (THEA CV Pilot) implemented several vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) applications on more than 1,000 private vehicles. This paper focuses on the Forward Collision Warning (FCW) application to study factors that are associated with drivers’ reactions to FCWs and to investigate if the observed driving styles derived from the data support the participants’ stated driving styles obtained from their survey responses. A panel of participants, driving in real-world traffic conditions for over two years with retrofitted CV technology and integrated FCW application, is used. The panel consists of a treatment (Human Machine Interface (HMI) enabled) and a control (HMI disabled) group. Random parameters logit and correlated grouped random parameter logit models are estimated to reveal possible associations between stated and observed driving behavior, HMI exposure, socio-demographic factors, and the response variable (drivers’ reaction to FCW). The study found an association between one measure of driving volatility and drivers’ reactions: as driving volatility (a proxy for driving aggressiveness) increases, the probability of reacting to an FCW declines. The study also found that the probability of reaction for drivers who received a warning (audiovisual) via HMI increased by 9.93% compared to those who did not receive a warning.
Article
Contemporary traffic safety research offers little information on quantifying the simultaneous association between drink driving and speeding among fatally injured drivers. Potential correlation between drivers’ drink driving and speeding behavior poses a substantial methodological concern which needs investigation. This study therefore focused on investigating the simultaneous impact of socioeconomic factors, fatalities, vehicle ownership, health services and highway agency road safety policies on enforcement levels of speed limit and drink driving laws. The effectiveness of enforcement levels of speed limit and drink driving laws has been investigated through development of a bivariate ordered probit model using data extracted from WHO’s global status report on road safety in 2013. The consistent and intuitive parameter estimates, along with the statistically significant correlation between response outcomes, validate the statistical superiority of the bivariate ordered probit model. The results revealed that fatalities per thousand registered vehicles, hospital beds per hundred thousand population and road safety policies are associated with a likely medium or high effectiveness of enforcement levels of speed limit and drink driving laws, respectively. Also, the model encapsulates the effect of several other agency-related variables and socio-economic status on the response outcomes. Marginal effects are reported for analyzing the impact of such factors on intermediate categories of response outcomes. The results of this study are expected to provide necessary insights to elemental enforcement programs. Also, marginal effects of explanatory variables may provide useful directions for formulating effective policy countermeasures for overcoming drivers’ speeding and drink driving behavior.
Article
Traffic incidents, often known as non-recurring events, impose enormous economic and social costs. Compared to short duration incidents, large-scale incidents can substantially disrupt traffic flows by blocking lanes on highways for long periods of time. A careful examination of large-scale incidents and associated factors can assist with actionable large-scale incident management strategies. For such an analysis, a unique and comprehensive 5-year incident database on East Tennessee roadways was assembled to conduct in-depth investigation of large-scale incidents, especially focusing on operational responses (e.g., response and On-Scene times) by various agencies. Incidents that last longer than 120 minutes and block at least one lane are considered large-scale, giving 890 incidents, which are about 0.69% of all reported incidents in the database. Rigorous fixed- and random-parameter hazard-based duration models are estimated to account for the possibility of unobserved heterogeneity in large-scale incidents. The modeling results reveal significant heterogeneity in associations between operational responses and large-scale incident durations. A 30-minute increase in response time for first, second, and third (or more) highway response units translates to 2.8, 1.6, and 4.2 percent increase in large-scale incident durations, respectively. In addition, longer response times for towing and highway patrol are also significantly associated with longer incident durations. Given large-scale incidents, associated factors also include vehicle fire, unscheduled roadwork weekdays, afternoon peaks, AADT, etc.; however, the magnitude of associations is heterogeneous, i.e., the direction can be positive in some cases and negative in other cases. Practical implications of results for large-scale incident management are discussed.
Conference Paper
Freeway bottlenecks often cause traffic capacity drops and speed oscillations that not only compromise traffic performance at the bottlenecks but also likely propagate far backward to break down the upstream traffic. The adverse impacts of bottlenecks include more travel delay, excessive fuel consumption and emissions, and extra safety risks. With the advent of connected and automated vehicles (CAV) technologies, we can control detailed vehicle trajectory shapes within sensing, communication, computation, and physical limits. In this study, a CAV-based trajectory-smoothing concept is proposed to harmonize traffic and improve mobility and environmental impacts. The presented algorithm is applicable to mixed-traffic environments where only a portion of vehicles are CAVs. Simulation analyses are performed to assess the algorithm performance. The results show significant improvement in traffic throughput as well as in fuel consumption and emissions.
Article
Traffic incidents occur frequently on urban roadways and cause incident induced congestion. Predicting incident duration is a key step in managing these events. Ordinary least squares (OLS) regression models can be estimated to relate the mean of incident duration data with its correlates. Because of the presence of larger incidents, duration distributions are often right-skewed; that is, the OLS model underpredicts the durations of larger incidents. Therefore, this study applies a modeling technique known as quantile regression to predict more accurately the skewed distribution of incident durations. Quantile regression estimates the relationships between correlates and a chosen percentile—for example, the 75th or 95th percentile—while the OLS regression is based on the mean of incident duration. With the use of incident data related to more than 85,000 (2013 to 2015) incidents for highways in the Hampton Roads area of Virginia, quantile regression results indicate that the magnitudes of parameters and predictions can be quite different compared with OLS regression. In addition to predicting durations of larger incidents more accurately, quantile regressions can estimate the probability of an incident lasting for a specific duration; for example, incidents involving congestion and delay have an approximately 25% chance of lasting more than 100.8 min, while incidents excluding congestion and delay are estimated to have a 25% chance of lasting more than 43.3 min. Such information is helpful in accurately predicting durations and developing potential applications for using quantile regressions for better traffic incident management.
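The contrast between mean-based and quantile-based prediction can be illustrated with the intercept-only case: the constant that minimizes the quantile (pinball) loss is the corresponding sample quantile, whereas least squares returns the mean. The sketch below uses hypothetical, right-skewed incident durations and hypothetical function names; a full quantile regression would additionally let the prediction depend on covariates.

```python
def pinball_loss(prediction, observations, q):
    """Average quantile (pinball) loss of a constant prediction at quantile q."""
    total = 0.0
    for y in observations:
        diff = y - prediction
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(observations)


def best_constant(observations, q):
    """Observed value that minimizes the pinball loss at quantile q."""
    return min(sorted(observations), key=lambda c: pinball_loss(c, observations, q))


# Hypothetical incident durations in minutes, with one very long incident.
durations = [10, 15, 20, 25, 30, 35, 40, 50, 90, 285]

mean_duration = sum(durations) / len(durations)  # what an intercept-only OLS fits
q75_duration = best_constant(durations, 0.75)
q95_duration = best_constant(durations, 0.95)
```

On this sample the mean is 60 minutes, while the 95th-percentile fit lands on the 285-minute incident, which is the kind of upper-tail behavior a mean-based model cannot represent.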
Conference Paper
Drivers make their short-term steering and speed decisions based on incoming information from several sources. In order to navigate through the transportation network they adjust their speeds and change lanes, exhibiting substantial variation in driving tasks during trips undertaken within urban areas. Large variations in accelerations and decelerations can be associated with fuel wastage, greater emissions, safety problems, and avoidable wear and tear on brakes and the engine. To reduce costs, this study explores how a profile of driving variations, especially hard accelerations and braking obtained using smart devices, can be used to generate actionable warnings and alerts. Hard acceleration or braking occurs when a driver applies greater than “normal” pressure on the accelerator or brake. As a result, vehicles experience higher levels of acceleration or deceleration. To provide drivers with real-time driving warnings, their instantaneous decisions can be monitored and analyzed. This study develops a fundamental understanding of their instantaneous driving decisions. It quantifies their driving style and captures their level of volatility during driving. Empirical analysis is based on a large-scale travel behavior survey, containing 51,370 trips and their associated second-by-second (total 36 million seconds) Global Positioning System (GPS) data from Atlanta, GA collected during 2011. It shows how acceleration and braking monitoring can be used to generate alerts. Outlier driving patterns are the key to generating actionable volatility information. Results from rigorous statistical modeling reveal that driving volatility varies significantly between driver groups and it is highly associated with trip attributes, e.g., time of day, trip durations and trip lengths. The implications of the findings and potential applications to fleet vehicles and driving population are discussed.
Article
Vehicle-to-vehicle (V2V) communication networks, as one of the core components of connected vehicle systems, have enabled many promising applications to address traffic mobility, safety, and sustainability. However, only a limited amount of work has been completed to understand the fundamental properties of information propagation in such systems, while comprehensively considering traffic and communication reality. Motivated by this view, this research develops analytical formulations to estimate information propagation time delay via a V2V communication network formed on a one-way or two-way road segment with multiple lanes. In contrast to previous efforts, the proposed study incorporates several critical real-world communication and traffic flow features, such as wireless communication interference, intermittent information transmission, and dynamic traffic flow. Moreover, this study analyzes in detail the interactions between information and traffic flow under sparse and congested traffic flow conditions. The numerical experiments based on Next-Generation Simulation field data illustrate that the proposed analytical formulations are able to provide very good estimation, with the relative error less than 5%, for the information propagation time delay on a one-way or two-way road segment under various traffic conditions. The proposed work can be further extended to characterize information propagation time delay and coverage over local transportation networks.
Conference Paper
This paper formulates a simplified trajectory smoothing model for guiding movements of connected automated vehicles on a general one-lane highway segment. This simplified problem constructs each vehicle trajectory as a piece-wise quadratic function with no more than five pieces and lets all trajectories share identical acceleration and deceleration rates. This paper investigates theoretical properties of this problem structure and proposes an exact analytical algorithm that very efficiently finds the optimal solution. Numerical examples are conducted to illustrate how to apply this model to CAV control problems on signalized segments and at non-stop intersections.
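A piece-wise quadratic trajectory is what results from stitching together segments of constant acceleration. The sketch below only evaluates such a trajectory at a given time; it does not implement the paper's optimization algorithm. The function name, the segment encoding, and the example rates (a shared magnitude of 2 m/s² for both braking and accelerating) are hypothetical.

```python
def position_at(t, segments, x0=0.0, v0=0.0):
    """Position at time t for a trajectory built from constant-acceleration pieces.

    segments: list of (duration_s, acceleration_mps2) pairs applied in order.
    Within each piece the position is a quadratic function of time.
    After the last piece the vehicle is assumed to hold its final speed.
    """
    x, v = x0, v0
    for duration, accel in segments:
        if t <= 0:
            break
        dt = min(t, duration)            # time actually spent in this piece
        x += v * dt + 0.5 * accel * dt * dt
        v += accel * dt
        t -= dt
    if t > 0:
        x += v * t                       # cruise beyond the defined pieces
    return x


# Hypothetical three-piece trajectory: brake, cruise, then accelerate back.
pieces = [(2.0, -2.0), (3.0, 0.0), (2.0, 2.0)]
positions = [position_at(t, pieces, x0=0.0, v0=10.0) for t in (2.0, 5.0, 7.0, 8.0)]
```

Starting at 10 m/s, the vehicle covers 16 m while braking, reaches 34 m after cruising at 6 m/s, and is back at 10 m/s by the 50 m mark.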
Article
When vehicles share their status information with other vehicles or the infrastructure, driving actions can be planned better, hazards can be identified sooner, and safer responses to hazards are possible. The Safety Pilot Model Deployment (SPMD) is underway in Ann Arbor, Michigan; the purpose is to demonstrate connected technologies in a real-world environment. The core data transmitted through Vehicle-to-Vehicle and Vehicle-to-Infrastructure (or V2V and V2I) applications are called Basic Safety Messages (BSMs), which are transmitted typically at a frequency of 10 Hz. BSMs describe a vehicle’s position (latitude, longitude, and elevation) and motion (heading, speed, and acceleration). This study proposes a data analytic methodology to extract critical information from raw BSM data available from SPMD. A total of 968,522 records of basic safety messages, gathered from 155 trips made by 49 vehicles, was analyzed. The information extracted from BSM data captured extreme driving events such as hard accelerations and braking. This information can be provided to drivers, giving them instantaneous feedback about dangers in surrounding roadway environments; it can also provide control assistance. While extracting critical information from BSMs, this study offers a fundamental understanding of instantaneous driving decisions. Longitudinal and lateral accelerations included in BSMs were specifically investigated. Varying distributions of instantaneous longitudinal and lateral accelerations are quantified. Based on the distributions, the study created a framework for generating alerts/warnings, and control assistance from extreme events, transmittable through V2V and V2I applications. Models were estimated to untangle the correlates of extreme events. The implications of the findings and applications to connected vehicles are discussed in this paper. *Note*: Abstract will be updated to be consistent with the final version.
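Extreme-event extraction from BSM accelerations can be sketched as a simple distribution-based filter. The rule used here (flag samples more than k standard deviations from the mean) is an assumption chosen for illustration, as are the function name, the value k = 2, and the sample data; the study's actual thresholds are not stated in the abstract.

```python
from statistics import mean, pstdev


def flag_extreme_events(longitudinal_accels, k=2.0):
    """Return (sample_index, label) for accelerations far from the typical range.

    Assumption: "extreme" means beyond mean +/- k * standard deviation.
    At a 10 Hz BSM rate, sample_index / 10 gives the elapsed time in seconds.
    """
    center = mean(longitudinal_accels)
    spread = pstdev(longitudinal_accels)
    upper = center + k * spread
    lower = center - k * spread

    events = []
    for index, accel in enumerate(longitudinal_accels):
        if accel > upper:
            events.append((index, "hard_acceleration"))
        elif accel < lower:
            events.append((index, "hard_braking"))
    return events


# Hypothetical longitudinal accelerations (m/s^2) from consecutive BSMs.
sample = [0.1, -0.1, 0.2, 0.0, -0.2, 0.1, 3.5, 0.0, -0.1, -4.0, 0.1, 0.0]
events = flag_extreme_events(sample)
```

The same filter could be applied to lateral accelerations; in practice the thresholds would need to be calibrated against the observed distributions rather than fixed in advance.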
Article
Despite enhanced safety strategies, in-vehicle technologies, and improvements in infrastructure, urban transportation networks are still accident-prone. Connected vehicle technology offers the possibility to exchange data with vehicles and infrastructure in an effort to improve safety. The main objective of the research reported in this paper is to evaluate the potential safety benefits of deploying a connected vehicle system on a traffic network in the presence of a work zone. The modeled connected vehicle system uses vehicle-to-vehicle (VTV) communication to share information about work zone links and link travel times. Vehicles which receive work zone information will also modify their driving behavior by increasing awareness and decreasing aggressiveness. This paper also proposes a decaying average travel time dynamic route guidance algorithm which exhibits weighted information decay. Traffic microsimulation software is used to model the network and a C plugin is developed to implement connected vehicles in the simulation. The surrogate safety measure improved time to collision (TTC) is used to assess the safety of the network. Various market penetrations of connected vehicles were utilized along with three different behavior models to account for the uncertainty in driver response to connected vehicle information. The results show that network safety is strongly correlated with the behavior model used; conservative models yield conservative changes in network safety. The results also show that market penetrations of connected vehicles under 40% contribute to a safer traffic network, while market penetrations above 40% decrease network safety. The findings indicate that connected vehicle technology can have unintended consequences, as seen in decreased safety at high market penetrations, requiring researchers to develop additional applications to mitigate these effects.