The effect of zonal factors in estimating crash risks by transportation modes: Motor vehicle, bicycle and pedestrian

Abstract

Objectives: This paper aimed to (i) differentiate the effects of contributory factors on crash risks related to different transportation modes, i.e., motor vehicle, bicycle and pedestrian; (ii) explore the potential contribution of zone-level factors which are traditionally excluded or omitted, so as to track the source of heterogeneous effects of certain risk factors in crash-frequency models by different modes. Methods: Two analytical methods, i.e. negative binomial models (NB) and random parameters negative binomial models (RPNB), were employed to relate crash frequencies of different transportation modes to a variety of risk factors at intersections. Five years of crash data, traffic volume, geometric design as well as macroscopic variables at traffic analysis zone (TAZ) level for 279 intersections were used for analysis as a case study. Results: Among the findings are: (1) the sets of significant variables in crash-frequency analysis differed for different transportation modes; (2) omission of macroscopic variables would result in biased parameters estimation and incorrect inferences; (3) the zonal factors (macroscopic factors) considered played a more important role in elevating the model performance for non-motorized than motor-vehicle crashes; (4) a relatively smaller buffer width to extract macroscopic factors surrounding the intersection yielded better estimations.
Research highlights
This paper investigates factors associated with crash occurrences by different transportation modes.
Zonal factors have significant effects on intersection crashes.
Zonal factors contribute to tracking the source of heterogeneity of risk factors.
Zonal factors play a more important role in elevating the model performance for non-motorized than motor-vehicle crashes.
A relatively smaller buffer width to extract zonal factors yields better estimations.

Jie WANG¹, Helai HUANG¹*, Qiang ZENG²

¹ School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan, China
² School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong, China
*Correspondence
E-mail address: jie_wang@csu.edu.cn (J. Wang), huanghelai@csu.edu.cn (H. Huang), zengqiang@scut.edu.cn (Q. Zeng)

Keywords: transportation modes; macroscopic variables; unobserved heterogeneity; buffer width; intersection safety

1 Introduction
Many communities have increased their interest in the implementation of multimodal transportation and advocated for the shift from motor vehicles to non-motorized modes of transportation, i.e., walking and cycling. In spite of the health and environmental benefits, an increasing number of crashes involving pedestrians and bicyclists has become a major concern in improving traffic safety. For example, in 2013 the United States had 4735 pedestrian and 743 bicyclist deaths, accounting for 18% of all U.S. highway fatalities (NHTSA, 2013). The Federal Highway Administration's Office of Safety has established pedestrian and bicyclist safety as one of its top priorities. Thus, it is essential for traffic safety engineers to provide appropriate countermeasures or policies to achieve friendly and safe multimodal transportation.
A comprehensive understanding of contributing factors associated with crash occurrences by different modes is a prerequisite for developing safety improvement programs to effectively reduce traffic crashes. For a given road entity (e.g. road segments or intersections), the potential factors associated with multimodal crashes can be summarized as in Figure 1, in accordance with Miranda-Moreno et al. (2011), Mitra and Washington (2012), Ukkusuri et al. (2012) and Strauss et al. (2003; 2014). The factors influencing road-entity-level crash frequency by mode include macroscopic factors related to the built environment of the road entities (such as population and economic characteristics, land use characteristics and travel behaviors) as well as road features and traffic characteristics of the road entities. In addition, crash occurrence is also associated with individual characteristics such as gender, age, education, alcohol consumption, and other driver and pedestrian behaviors (Ryb et al., 2007). Although discrete individual-level factors are not available to be integrated into the crash-frequency model, individual characteristics are always influenced by macroscopic factors (Christoffel and Gallagher, 1999). Therefore, macroscopic factors could serve as a surrogate for individual behaviors.

Figure 1. Factors associated with multimodal crashes

The choice of an appropriate analytical method and the selection of representative explanatory variables are two important considerations for obtaining accurate model predictions. Over the past three decades, considerable research efforts have been devoted to developing and applying sophisticated methodological approaches associated with the analysis of crash frequency. Detailed descriptions and assessments of crash-frequency models can be found in the review papers by Lord and Mannering (2010) and Mannering and Bhat (2014). However, relatively few studies have focused on the identification and inclusion of traditionally excluded or omitted variables in crash-frequency analysis. In particular, variables related to the macroscopic factors previously described (in Figure 1) are normally unavailable in crash databases and as a result have rarely been examined in great detail. Mitra and Washington (2012) is one of a few studies exploring the omitted variables in crash-frequency modeling. The authors developed two different models for estimating intersection crash frequency, one with traffic volume as the only independent variable, and the other with several spatial factors in addition to commonly included geometric design and traffic factors. A comparison of the two models indicated that some spatial factors, such as local influences of weather, sun glare, proximity to drinking establishments, proximity to schools and demographic attributes near intersections, have significant explanatory power and that their exclusion leads to biased estimates.
Statistical methods such as spatial and temporal correlation, multilevel, random effect, random parameter, and latent class approaches have been developed to address this issue of unobserved heterogeneity (Anastasopoulos and Mannering, 2009; Dong et al., 2016; Mannering et al., 2016; Quddus, 2008; Wang and Huang, 2016; Xu and Huang, 2015; Xu et al., 2016), as these omitted explanatory variables can be regarded as part of the unobserved heterogeneity. Unobserved heterogeneity impacts traffic safety analysis in two ways. The first problem is that the selected explanatory variables cannot fully account for the cross-sectional or longitudinal variations in crash counts due to unobserved road geometrics, environmental factors, driver behavior and other confounding factors, which leads to impaired predictive performance of the model (called heterogeneity in model prediction). The second problem is that these unobserved factors are always correlated with observed factors, and thus biased parameters will be estimated and incorrect inferences could be drawn (called heterogeneity in the coefficient estimator). While these approaches will mitigate the adverse impacts of omitting significant explanatory variables, the resulting model estimates still fail to track the original source of heterogeneity and to quantify the safety effect of omitted variables (such as the macroscopic factors shown in Figure 1). Omission of important explanatory variables still remains a problem even with advanced statistical approaches to capture unobserved heterogeneity (Mannering et al., 2016).
The study by Mitra and Washington (2012) attempted to investigate the safety effect of some important omitted variables on total crash frequency and their contribution to model estimation. As Venkataraman et al. (2013) stated, frequency models of crash outcome type can provide substantial insights into the effect of explanatory variables and assist in examining the heterogeneity effects in roadway geometric features. This paper aims to extend previous research (Mitra and Washington, 2012) and investigate how macroscopic factors affect the crash-frequency analysis for different transportation modes. This is because some macroscopic variables may have inconsistent impacts on motor vehicle and non-motorized (including bicycle and pedestrian) crashes. For example, Lee et al. (2015) utilized a multivariate model for investigating motor vehicle and non-motorized crashes at the macroscopic level. The parameter estimates suggested that some zonal variables related to demographics and road characteristics have effects of opposite direction on motor vehicle and non-motorized crashes. Meanwhile, the most appropriate buffer width for extracting macroscopic factors may differ between models of motor vehicle and non-motorized crashes. Therefore, it is advisable to model the crash frequency by separate transportation modes to examine the effects of macroscopic factors.
In summary, the objective of this paper is twofold: (1) to examine the effects of a host of contributing factors, including both macroscopic and microscopic factors, on crash occurrence with respect to different transportation modes; (2) to shed further light on the contribution of the macroscopic factors, which are traditionally excluded or omitted variables, to tracking the source of heterogeneity effects in coefficient estimators of regularly used variables and to improving the model performance in crash-frequency analysis related to different modes.
2 Data preparation
In this study, data collected for 279 intersections located in Hillsborough County, Florida, USA were used to develop the intersection crash-frequency models for different transportation modes. The data for the analysis were divided into four main types: (1) traffic crash data; (2) traffic characteristics; (3) road characteristics related to geometric design and traffic control/regulation of the intersection; and (4) macroscopic factors, including trip production/attraction, demographic and socio-economic characteristics surrounding the intersection. The derivation and processing of these data sources are described next.
2.1 Crash data
Crash data for the intersections in a five-year period (2005–2009) were obtained from the Florida Department of Transportation (FDOT) Crash Analysis Reporting (CAR) system. Crashes were categorized as intersection-related crashes if they occurred within the curb-line limits of the intersection or within the influence area of the intersection, which extends 250 feet from the stop line. Intersection-level crash data were disaggregated into motor vehicle, bicycle, and pedestrian crashes. A motor vehicle crash was defined as a collision between two or more motor vehicles or between a motor vehicle and an object. A bicycle crash referred to a collision between a motor vehicle and a bicycle. Likewise, a pedestrian crash denoted a collision between a motor vehicle and a pedestrian.
2.2 Traffic characteristics
Previous research suggests that traffic characteristics such as motor vehicle, pedestrian and bicycle volume are the most important factors influencing crash occurrences. Motor vehicle volume, represented by average annual daily traffic (AADT), was collected from the FDOT Roadway Characteristics Inventory. Two motor vehicle volume variables were obtained: the 5-year (2005–2009) average AADT on the major road and on the minor road. Actual pedestrian and bicycle volumes are not regularly available. The collected macroscopic data, such as population, were used as a surrogate for pedestrian and bicycle volume, as suggested by Jacobsen (2003) and Miranda-Moreno et al. (2011).
2.3 Road features
Road features related to geometric design and regulatory/control attributes of the road entity were collected from the FDOT Roadway Characteristics Inventory. The road factors considered in the study are number of legs, presence of traffic signal, speed limit on major approach, and speed limit on minor approach.
2.4 Macroscopic factors
A considerable number of previous studies on zonal-level crash-frequency models suggest that various macroscopic factors, such as trip production/attraction, demographic and socio-economic characteristics, affect area-wide traffic crashes (Quddus, 2008; Huang et al., 2010, 2016; Abdel-Aty et al., 2011; Xu et al., 2014; Dong et al., 2015, 2016). It is hypothesized that the number of crashes occurring at an intersection is also associated with these macroscopic factors surrounding the intersection.
Trip production/attraction factors, such as total trip productions/attractions, home-based work productions/attractions and college productions/attractions at the TAZ level, were collected from the Intermodal Systems Development Unit of District 7 of the FDOT. The demographic and socio-economic characteristics examined included the geographical area of each TAZ, population, income and commuting, which were downloaded from the United States Census report.
ArcGIS 10.0 was used to generate a buffer around each selected intersection and conduct a spatial analysis to extract macroscopic data from the TAZ layers. The process is described in detail as follows. First, a buffer of a specified width (0.25 mile, 0.5 mile or 1 mile) was generated around each intersection and spatially overlaid on the TAZ layers. Then, spatial analysis operators such as "intersect" and "join" available in the GIS environment were used to intersect layers, join tables and extract selected trip production/attraction, demographic and socio-economic characteristics within the generated buffer. Macroscopic factors were distributed in proportion to the area of each TAZ within the generated buffer. The ArcGIS procedure adopted to extract and estimate macroscopic factors in this study was similar to the one discussed in detail by Pulugurtha and Sambhara (2011) to develop pedestrian crash estimation models.
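The area-proportional allocation described above can be illustrated with a minimal sketch. It is not the ArcGIS workflow used in the study; the record layout (`taz_area`, `area_in_buffer`, `value`) and the two example zones are hypothetical, and the sketch assumes the zone total is spread uniformly over the zone's area.

```python
def apportion_to_buffer(taz_records):
    """Sum zone totals weighted by the share of each zone's area inside the buffer."""
    total = 0.0
    for record in taz_records:
        # Fraction of the zone that falls inside the buffer (assumes uniform spread).
        share = record["area_in_buffer"] / record["taz_area"]
        total += record["value"] * share
    return total


# Hypothetical zones overlapping one buffer (areas in acres, value = population).
zones = [
    {"taz_area": 100.0, "area_in_buffer": 25.0, "value": 400.0},
    {"taz_area": 200.0, "area_in_buffer": 100.0, "value": 300.0},
]

buffer_population = apportion_to_buffer(zones)
# Area of the buffer covered by the zones, used to turn the total into a density.
covered_area = sum(record["area_in_buffer"] for record in zones)
buffer_density = buffer_population / covered_area  # persons per acre
```

In this made-up example the first zone contributes a quarter of its population and the second contributes half, which is the same weighting that would turn TAZ totals into buffer-level densities such as Pop_density.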
Table 1 provides descriptive statistics of crash data, traffic variables, road variables and macroscopic variables within the 0.5 mile buffer. The values of macroscopic variables within the 0.25 and 1 mile buffers are not listed for compactness of the table.

Table 1 Summary of variables and descriptive statistics

| Variable | Definition | Mean (a) | SD (a) | Min (a) | Max (a) |
|---|---|---|---|---|---|
| Crash data | | | | | |
| Motor vehicle crash | Motor vehicle crashes per intersection in 2005-2009 | 65.219 | 56.545 | 2.000 | 293.00 |
| Bicycle crash | Bicycle crashes per intersection in 2005-2009 | 1.018 | 1.340 | 0.000 | 8.000 |
| Pedestrian crash | Pedestrian crashes per intersection in 2005-2009 | 1.276 | 1.889 | 0.000 | 11.000 |
| Traffic and road variables | | | | | |
| AADT-major | AADT on major approach (10^3 pcu) | 28.364 | 17.684 | 2.600 | 71.300 |
| AADT-minor | AADT on minor approach (10^3 pcu) | 9.150 | 8.879 | 1.000 | 43.000 |
| Leg-number | Number of legs (4 legs=1, 3 legs=0) | 0.670 | 0.471 | 0.000 | 1.000 |
| Traffic signal | Presence of traffic signal (yes=1, no=0) | 0.498 | 0.542 | 0.000 | 4.000 |
| Speed-Major | Speed limit on major approach (mph) | 40.502 | 6.154 | 6.154 | 60.000 |
| Speed-Minor | Speed limit on minor approach (mph) | 35.323 | 6.479 | 6.479 | 55.000 |
| Macroscopic variables | | | | | |
| PA_density | Density of productions and attractions (per acre) | 49.272 | 27.369 | 2.514 | 184.06 |
| HB_prop | Proportion of home-based productions and attractions | 0.680 | 0.084 | 0.317 | 0.840 |
| Col_prop | Proportion of college productions and attractions | 0.025 | 0.059 | 0.000 | 0.415 |
| Pop_density | Density of total population (per acre) | 5.631 | 2.554 | 0.342 | 11.798 |
| Age 0 to 15_prop | Proportion of population between age 0 and 15 | 0.225 | 0.050 | 0.069 | 0.348 |
| Age 16 to 64_prop | Proportion of population between age 16 and 64 | 0.652 | 0.054 | 0.561 | 0.917 |
| Pub_prop | Proportion of workers commuting by public transportation | 0.030 | 0.031 | 0.000 | 0.149 |
| Wal_prop | Proportion of workers commuting by walking | 0.025 | 0.017 | 0.000 | 0.080 |
| MHINC | Median household income (in thousands) | 36.728 | 15.199 | 4.300 | 89.035 |

(a) These values relating to macroscopic variables are only for the 0.5 mile buffer.
3 Methodology
In previous crash-frequency analyses, Poisson and negative binomial (NB) models, along with their variants (such as the Poisson-lognormal model), are commonly used and have proven to be successful, as they effectively model the rare, random, sporadic, and non-negative crash data. As crash data exhibit over-dispersion (i.e., variance greater than the mean), NB is superior to the Poisson model. Compared with the basic Poisson model, NB includes a gamma-distributed error term in the Poisson mean to account for the over-dispersion due to omission of relevant variables or measurement error in crash data. The formulation for NB can be presented as follows:

$$Y_i \sim \mathrm{Poisson}(\lambda_i) \tag{1}$$

$$\ln \lambda_i = \beta_0 + X_i \beta + \varepsilon_i \tag{2}$$

where $Y_i$ is the crash frequency by mode (i.e., motor vehicle, bicycle and pedestrian) at intersection $i$, and $\lambda_i$ is the expectation of $Y_i$. $X_i$ is a vector of explanatory variables, $\beta_0$ is the intercept, and $\beta$ is a vector of estimable parameters. $\varepsilon_i$ is the error term, which is assumed to be independent of $X_i$ and to have a two-parameter gamma distribution.
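To make the role of the gamma-distributed term concrete, the following sketch simulates counts with and without a multiplicative gamma error on the Poisson mean. It is an illustration only, not the estimation procedure used in this paper: it assumes the common mean-one gamma parameterization (shape $1/\alpha$, scale $\alpha$), and the function names, sample size and parameter values are hypothetical.

```python
import math
import random
import statistics


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n, mean, error_variance, seed=0):
    """Simulate n counts whose Poisson mean is scaled by a mean-one gamma multiplier."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        if error_variance > 0:
            # gammavariate(shape, scale): mean = shape*scale = 1, variance = error_variance
            multiplier = rng.gammavariate(1.0 / error_variance, error_variance)
        else:
            multiplier = 1.0  # plain Poisson: no extra heterogeneity
        counts.append(poisson_draw(rng, mean * multiplier))
    return counts


nb_counts = simulate_counts(20000, 2.0, 0.5, seed=1)       # gamma-mixed (NB-type) counts
poisson_counts = simulate_counts(20000, 2.0, 0.0, seed=1)  # pure Poisson counts
nb_ratio = statistics.pvariance(nb_counts) / statistics.mean(nb_counts)
poisson_ratio = statistics.pvariance(poisson_counts) / statistics.mean(poisson_counts)
```

Under these assumptions the variance-to-mean ratio of the gamma-mixed counts should sit well above one, while the pure Poisson counts stay close to one, which is the over-dispersion that motivates NB over the Poisson model.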
The NB model presented in Eq. (2) can control for unobserved heterogeneity caused by omitted variables. However, this model assumes that the unobserved variables are uncorrelated with the observed explanatory variables. If this correlation exists, unobserved factors can introduce variation in the effect of observed variables on crash likelihood. Random parameters approaches are able to address this issue by allowing estimable parameters to vary across observations (Mannering et al., 2016). In the random parameters negative binomial model (RPNB), the estimable parameters ($\beta$) in Eq. (2) can be written as:

$$\beta_i = \beta + \varphi_i \tag{3}$$

where $\beta$ is the mean of the random parameter $\beta_i$, and $\varphi_i$ is a randomly distributed term (e.g., a normally distributed term with mean 0 and variance $\sigma^2$) that captures heterogeneity across observations. The analyst can test for random parameters with Eq. (3), across all observations $i$, for each included explanatory variable. If the variance of the chosen distribution is not significantly different from zero, a conventional fixed parameter is statistically appropriate. Thus the model may combine fixed and random parameters across the included explanatory variables.
As previously stated, heterogeneity effects of certain risk factors mainly derive from the combined effects of unobserved variables that have been omitted from the model. Although the random parameters approaches could mitigate the adverse impacts of omitting variables, the original source of unobserved heterogeneity (i.e., the major factors that lead to unobserved heterogeneity) is still not well understood. Thus one of the aims of this study is to test the potential role of macroscopic variables (referred to as the influential omitted variables) in tracking the source of heterogeneity effects related to commonly used traffic and road variables.
To this end, two different model specifications are estimated and compared, both with random parameters approaches. In the base model only traffic and road variables are included, while in the second model traffic and road variables as well as macroscopic variables at the TAZ level surrounding the intersections are included. If the parameter of a variable is found to be random in the base model (the variance of the chosen distribution is significantly different from zero) while the same variable has a fixed effect in the second model, then we could infer that the heterogeneity effect of this variable is mainly caused by these macroscopic variables. In the other case, where the safety effect of this variable is still random in the second model, the source of the heterogeneity effect of this variable still cannot be clearly distinguished, possibly because of other important omitted variables.
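The comparison rule in the preceding paragraph can be written down as a small decision function. This is a sketch of the reasoning only; the function name and the returned labels are hypothetical, and the combination in which a parameter is fixed in the base model but random in the full model is not discussed in the text, so it is flagged as such.

```python
def heterogeneity_source(random_in_base, random_in_full):
    """Classify a variable by whether its parameter is random in the base and full models."""
    if random_in_base and not random_in_full:
        # Random without macroscopic variables, fixed once they are added.
        return "mainly explained by macroscopic variables"
    if random_in_base and random_in_full:
        # Still random after adding macroscopic variables.
        return "source unresolved, possibly other omitted variables"
    if not random_in_base and not random_in_full:
        return "no heterogeneity detected"
    # Fixed in base but random in full: not covered by the rule in the text.
    return "not covered by the comparison rule"


# Example inputs: a parameter that turns fixed, and one that stays random.
turns_fixed = heterogeneity_source(random_in_base=True, random_in_full=False)
stays_random = heterogeneity_source(random_in_base=True, random_in_full=True)
```

The two example calls mirror the two cases distinguished in the text: a parameter that becomes fixed once macroscopic variables enter the model, and one that remains random.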
Apart from the potential role in accounting for the heterogeneity effects in parameter estimation, integrating the macroscopic variables could also decrease the variance of the random error (i.e. overdispersion) and thus improve the model performance in crash-frequency prediction. The proportion of reduction in variance (PRV), also called explained variance, proposed by Raudenbush and Bryk (2002), can be used to assess the overall explanatory power of macroscopic factors for modeling the crashes by different modes. In this case, the PRV of macroscopic variables is defined as:

$$\mathrm{PRV} = \frac{\sigma_0^2 - \sigma_1^2}{\sigma_0^2} \tag{4}$$

where $\sigma_0^2$ is the variance of the error term in the base model without macroscopic variables, and $\sigma_1^2$ is the variance of the error term in the full model with macroscopic variables. The value of PRV is bounded by 0 and 1, and a higher value indicates a stronger explanatory power of macroscopic factors on the crash occurrence.
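As a worked instance of Eq. (4), the sketch below recomputes the PRV from the error-term variances reported in Tables 2-4 (base NB model versus the best 0.25 mile model for each mode). The function and dictionary names are illustrative. Because the tabulated variances are rounded to three decimals, the recomputed values can differ from the reported PRVs in the second decimal place.

```python
def prv(var_base, var_full):
    """Proportion of reduction in variance, Eq. (4): (var_base - var_full) / var_base."""
    return (var_base - var_full) / var_base


# (base NB variance, 0.25 mile model variance) as printed in Tables 2-4.
error_variances = {
    "motor vehicle": (0.250, 0.230),  # Table 2, NB with 0.25 mile buffer
    "bicycle": (0.402, 0.269),        # Table 3, NB with 0.25 mile buffer
    "pedestrian": (0.785, 0.578),     # Table 4, RPNB with 0.25 mile buffer
}

prv_by_mode = {mode: prv(base, full) for mode, (base, full) in error_variances.items()}
```

With these rounded inputs the motor vehicle value comes out at 0.08, the bicycle value near 0.33 and the pedestrian value near 0.26, in line with the reported 7.98%, 33.02% and 26.37%.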
Furthermore, two goodness-of-fit statistics are used for model comparisons: the Akaike Information Criterion (AIC) and the log-likelihood ratio (LR).
The AIC is calculated as follows:

$$\mathrm{AIC} = -2LL + 2p \tag{5}$$

where $LL$ is the log-likelihood at convergence for the estimated model, and $p$ is the number of parameters in the statistical model. The model with the lower AIC is considered to have the better goodness of fit.
The LR value is the chi-squared statistic of the log-likelihood ratio test, which reveals whether or not the null hypothesis that two models are equivalent should be rejected. The likelihood ratio statistic is

$$LR = -2\,(LL_N - LL_A) \tag{6}$$

which is $\chi^2$ distributed with $J$ degrees of freedom, where $J = K_A - K_N$ ($K_A$ and $K_N$ are the numbers of coefficients for the alternative model and the null model, respectively), and $LL_N$ and $LL_A$ are the log-likelihoods at convergence for the null model and the alternative model, respectively. The null hypothesis for Eq. (6) is that the alternative model does not have a significantly higher log-likelihood than the null model, indicating a lack of significant difference between the null model and the alternative model.
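Eqs. (5) and (6) can be checked numerically with a short sketch using the rounded log-likelihoods from Table 2. The helper names are illustrative, and the chi-squared upper-tail probability is computed with the standard closed-form series for integer degrees of freedom rather than with the software used in the study, so the p-values are approximate reproductions only.

```python
import math


def aic(log_likelihood, n_params):
    """Eq. (5): AIC = -2 LL + 2 p."""
    return -2.0 * log_likelihood + 2.0 * n_params


def lr_statistic(ll_null, ll_alt):
    """Eq. (6): LR = -2 (LL_N - LL_A)."""
    return -2.0 * (ll_null - ll_alt)


def chi2_sf(x, df):
    """Upper-tail chi-squared probability for a positive integer df (closed-form series)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in half-integer powers of x.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


# Motor vehicle models, Table 2: base NB (6 parameters) vs. 0.25 mile NB (10 parameters).
base_aic = aic(-1280.07, 6)
lr_quarter_mile = lr_statistic(-1280.07, -1269.25)
p_quarter_mile = chi2_sf(lr_quarter_mile, 10 - 6)
```

With the rounded inputs, `base_aic` reproduces the tabulated 2572.14 and `lr_quarter_mile` is 21.64, close to the reported 21.645; the corresponding p-value falls below 0.01.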
4 Results and discussion
Three types of crash-frequency models for motor vehicles, bicycles and pedestrians were developed. Each type of model involved eight separate models based on four model specifications (one with only traffic volume and road features, the other three with macroscopic factors overlaid on 0.25, 0.5 and 1 mile buffers, respectively, in addition to commonly included traffic volume and road features) and two analytical methods (NB and RPNB).
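The model count follows directly from crossing modes, specifications and methods, as the short enumeration below shows. The label strings are illustrative only.

```python
from itertools import product

modes = ["motor vehicle", "bicycle", "pedestrian"]
specifications = ["base", "0.25 mile buffer", "0.5 mile buffer", "1 mile buffer"]
methods = ["NB", "RPNB"]

# Every (mode, specification, method) combination is one candidate model.
model_grid = list(product(modes, specifications, methods))
models_per_mode = len(specifications) * len(methods)
```

Here `models_per_mode` is 8 and `len(model_grid)` is 24, i.e. eight candidate models for each of the three transportation modes.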
LIMDEP econometric software was used to develop the statistical models described above. To enable focus on the most significant variables, variables that were not found to be significantly different from zero at the 0.1 level of significance using a t-test were removed. Meanwhile, the likelihood ratio test was used to guarantee that each added variable significantly improved the overall model performance. In the RPNB, if the variance of a random parameter was not statistically different from zero, the random parameter was simplified to be fixed across intersections. Thus, the results of NB coincided with those of RPNB when no estimated parameters of the explanatory variables were statistically random.
The analysis below first tests the effects of macroscopic factors on model performance in crash-frequency analysis for the three transportation modes; the parameter estimates and marginal effects of the base model and the full model with macroscopic variables are then compared and interpreted.
4.1 Effects of macroscopic factors on model performance
Tables 2-4 show goodness-of-fit measures for motor vehicle, bicycle and pedestrian crash-frequency models, respectively. As shown in Table 2, only five of the eight motor vehicle crash-frequency models are presented, namely two base models (NB and RPNB) and three fully specified NB models, since no random parameter was significant, as measured by the t-statistics, in any of the three models with macroscopic variables. Although there was no substantial difference in goodness-of-fit, as reflected by the likelihood ratio test, between the base NB and RPNB models, the presence of significant random parameters (e.g. the variable 'presence of traffic signal' in this case study) demonstrated the existence of heterogeneity in the effects of risk factors in the base model without macroscopic factors. More interestingly, no significant random parameters were found in any of the three full models with macroscopic variables. This implied that the heterogeneous effects of risk factors on motor vehicle crash frequency could be mostly captured by these macroscopic variables, at least for the Hillsborough dataset examined here. Frequency analysis models for bicycle crashes presented a similar result to those for motor vehicle crashes, as shown in Table 3. However, this was not the case for the pedestrian crash-frequency models (Table 4). Significant random parameters, such as 'presence of traffic signal', existed in pedestrian crash models both with and without macroscopic variables, suggesting that the heterogeneity effect in parameter estimation cannot be completely picked up by these macroscopic variables.
Apart from the potential effect in tracking the heterogeneity, results also revealed that incorporating the macroscopic variables in crash-frequency analysis led to increased model complexity but a considerable improvement in overall fit as measured by the log likelihood at convergence. As shown in Table 2, the likelihood ratio test comparing the full NB models and the base NB model indicated that we were more than 99% confident (P-value < 0.01) that the full models with macroscopic variables (except the full model with 1.0 mile buffer-width macroscopic variables) were statistically superior. This comparison suggested that the macroscopic variables explained a portion of the variability in crash occurrences and should not be omitted from the motor vehicle crash-frequency model. With regard to the bicycle and pedestrian models, omission of macroscopic variables also led to a significant decrease in goodness-of-fit, as shown in Tables 3-4.
Comparing the model outputs based on 0.25 mile, 0.5 mile and 1 mile buffer width data, the models with macroscopic variables of 0.25 mile buffer width had the lowest AIC, whereas the models with macroscopic variables of 1 mile buffer width had the highest AIC among the three buffer widths, in all three types of crash-frequency models by mode. Thus, a relatively smaller buffer width in extracting macroscopic factors surrounding the intersection would yield a better estimate.
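The buffer-width comparison amounts to picking, for each mode, the buffer whose full NB model has the smallest AIC. A minimal sketch using the NB AIC values printed in Tables 2-4 follows; the dictionary layout and variable names are illustrative.

```python
# AIC of the full NB models by buffer width (miles), as printed in Tables 2-4.
aic_by_buffer = {
    "motor vehicle": {0.25: 2558.50, 0.50: 2561.73, 1.00: 2572.56},
    "bicycle": {0.25: 716.85, 0.50: 718.28, 1.00: 719.24},
    "pedestrian": {0.25: 786.68, 0.50: 788.54, 1.00: 791.05},
}

# Lower AIC indicates better fit, so take the arg-min over buffer widths per mode.
best_buffer = {mode: min(values, key=values.get) for mode, values in aic_by_buffer.items()}
worst_buffer = {mode: max(values, key=values.get) for mode, values in aic_by_buffer.items()}
```

For all three modes `best_buffer` is the 0.25 mile buffer and `worst_buffer` is the 1 mile buffer, matching the ordering described above.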
To further assess and compare the overall explanatory power of macroscopic variables, the values of PRV were calculated. As shown in Table 2, the motor vehicle crash-frequency model with macroscopic variables of 0.25 mile buffer width had the highest PRV of 7.98%. This meant that 7.98% of the unexplained variation resulted from those omitted macroscopic variables, which also suggested the usefulness of integrating macroscopic factors into the motor vehicle crash-frequency analysis. Correspondingly, the highest values of PRV were 33.02% and 26.37% in the bicycle and pedestrian crash-frequency models, respectively, as shown in Tables 3-4.
Table 2 Goodness-of-fit measures for motor vehicle crash-frequency models

| Model statistics | Base model, NB | Base model, RPNB | 0.25 mile, NB | 0.50 mile, NB | 1 mile, NB |
|---|---|---|---|---|---|
| Number of observations | 279 | 279 | 279 | 279 | 279 |
| Number of parameters | 6 | 7 | 10 | 10 | 10 |
| Log likelihood at convergence | -1280.07 | -1279.42 | -1269.25 | -1270.86 | -1276.28 |
| AIC | 2572.14 | 2572.84 | 2558.50 | 2561.73 | 2572.56 |
| Log-likelihood ratio test, χ² = -2(LLN-LLA) | | 1.300 | 21.645 | 18.416 | 7.588 |
| Degrees of freedom | | 1 | 4 | 4 | 4 |
| P-value | | 0.26 | <0.01 | <0.01 | 0.11 |
| Variance of the error term | 0.250 | 0.246 | 0.230 | 0.232 | 0.243 |
| Proportion of reduction in variance, PRV | | 1.23% | 7.98% | 6.85% | 2.67% |

Note: LLN denotes the log likelihood at convergence for the Base + NB model. The last two rows report the explanatory power of macroscopic factors.

Comparing PRVs across models for different transportation modes, the PRVs in the bicycle and pedestrian crash-frequency models were much higher than those in the motor vehicle models. In other words, integrating macroscopic factors was more vital in non-motorized crash-frequency models than in motor vehicle models. This result was in line with expectations. One possible reason for this distinct effect is that pedestrian/bicycle volume (or pedestrian/bicycle activity), which is commonly identified as the main determinant of pedestrian/bicycle crash frequency, was omitted from the base pedestrian/bicycle crash-frequency models. Integrating macroscopic factors into pedestrian/bicycle crash-frequency analysis made up, to some extent, for the absence of pedestrian/bicycle volume in predicting pedestrian/bicycle crash frequencies, as previous studies (Jacobsen, 2003; Miranda-Moreno et al., 2011) demonstrated that macroscopic data can serve as a surrogate for pedestrian and bicycle volume. Another reason, perhaps even more important, originates from the difference in travel distance between non-motorized and motor vehicle modes. As walking and cycling are short-distance transportation modes, pedestrian and bicyclist crash victims generally reside near the crash intersection, and thus the macroscopic factors extracted around the intersection can probably better reflect pedestrian and/or bicyclist behaviors than those of motor vehicle drivers.
Table 3 Goodness-of-fit measures for bicycle crash-frequency models

| Model statistics | Base model, NB | Base model, RPNB | 0.25 mile, NB | 0.50 mile, NB | 1 mile, NB |
|---|---|---|---|---|---|
| Number of observations | 279 | 279 | 279 | 279 | 279 |
| Number of parameters | 6 | 7 | 8 | 8 | 8 |
| Log likelihood at convergence | -362.53 | -361.61 | -350.42 | -351.14 | -351.62 |
| AIC | 737.06 | 737.22 | 716.85 | 718.28 | 719.24 |
| Log-likelihood ratio test, χ² = -2(LLN-LLA) | | 1.837 | 24.209 | 22.777 | 21.821 |
| Degrees of freedom | | 1 | 2 | 2 | 2 |
| P-value | | 0.17 | <0.01 | <0.01 | <0.01 |
| Variance of the error term | 0.402 | 0.335 | 0.269 | 0.274 | 0.280 |
| Proportion of reduction in variance, PRV | | 16.48% | 33.02% | 31.66% | 30.27% |

Note: LLN denotes the log likelihood at convergence for the Base + NB model. The last two rows report the explanatory power of macroscopic factors.

Table 4 Goodness-of-fit measures for pedestrian crash-frequency models

| Model statistics | Base model (NB) | Base model (RPNB) | 0.25 mile (NB) | 0.25 mile (RPNB) | 0.50 mile (NB) | 0.50 mile (RPNB) | 1 mile (NB) | 1 mile (RPNB) |
|---|---|---|---|---|---|---|---|---|
| Number of observations | 279 | 279 | 279 | 279 | 279 | 279 | 279 | 279 |
| Number of parameters | 4 | 5 | 7 | 8 | 7 | 8 | 7 | 8 |
| Log likelihood at convergence | -399.51 | -397.89 | -386.34 | -386.26 | -387.27 | -387.16 | -388.52 | -388.13 |
| AIC | 807.03 | 805.77 | 786.68 | 788.51 | 788.54 | 790.32 | 791.05 | 792.27 |
| Log-likelihood ratio test, χ² = -2(LLN - LLA) | | 3.256 | 26.352 | 26.515 | 24.487 | 24.712 | 21.983 | 22.761 |
| Degrees of freedom | | 1 | 3 | 4 | 3 | 4 | 3 | 4 |
| P-value | | 0.08 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 | <0.01 |
| Explanatory power of macroscopic factors | | | | | | | | |
| Variance of the error term | 0.785 | 0.613 | 0.585 | 0.578 | 0.595 | 0.587 | 0.601 | 0.595 |
| Proportion of reduction in variance, PRV | | 21.96% | 25.57% | 26.37% | 24.26% | 25.21% | 23.48% | 24.18% |

Note: LLN denotes the log likelihood at convergence for the base NB model.
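
The test statistics in the goodness-of-fit tables can be retraced from the reported log likelihoods and parameter counts. The following is a minimal sketch, assuming AIC = 2k - 2LL and a chi-square reference distribution for the likelihood ratio statistic; the helper names are illustrative, and the closed-form tail probability is implemented only for one or an even number of degrees of freedom, which covers every case in Tables 2-4. Inputs are the rounded values from Table 3, so the last digit can differ from the printed figures.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion, assuming AIC = 2k - 2LL."""
    return 2 * n_params - 2 * log_likelihood


def lr_statistic(ll_null: float, ll_alt: float) -> float:
    """Likelihood ratio statistic, chi2 = -2(LLN - LLA)."""
    return -2.0 * (ll_null - ll_alt)


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail chi-square probability for df = 1 or any even df (closed forms)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("closed form implemented only for df = 1 or even df")


# Bicycle models (Table 3): base NB model versus the 0.25 mile NB model.
ll_base, k_base = -362.53, 6
ll_full, k_full = -350.42, 8

lr = lr_statistic(ll_base, ll_full)
p_value = chi2_sf(lr, k_full - k_base)

print(f"LR = {lr:.2f}, p < 0.01: {p_value < 0.01}")
print(f"AIC base = {aic(ll_base, k_base):.2f}, AIC full = {aic(ll_full, k_full):.2f}")
```

The same helpers give a p-value of about 0.11 for the 1 mile motor vehicle model (statistic 7.588 with 4 degrees of freedom), in line with Table 2.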
4.2 Parameter estimates and marginal effects
Three types of crash-frequency models, for motor vehicles, bicycles and pedestrians, were estimated, and each type involved eight separate models based on four model specifications and two analytical methods, yielding a total of 24 models. Three full models with better goodness-of-fit, namely the motor vehicle, bicycle and pedestrian NB models with macroscopic variables extracted at the 0.25 mile buffer width, were selected as the recommended models for the reasons outlined above. The parameter estimates of the three base NB models for motor vehicles, bicycles and pedestrians are also presented for comparison. Tables 5-7 show the parameter estimates and their t-statistics for the motor vehicle, bicycle and pedestrian crash-frequency models, respectively. Because the models are nonlinear, the parameter estimates do not directly give the effect of a unit change on the number of crashes. Table 8 thus summarizes the marginal effects, which can be interpreted as the average impact of a unit change in an explanatory variable on crash frequency.
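As an illustration of how such marginal effects arise, the sketch below assumes a log-link count model, in which the expected crash frequency is exp(x'β) and the marginal effect of a continuous variable is its coefficient times the expected frequency, averaged over observations. The function name and the small data set are hypothetical. The last part checks this assumption against the reported full motor vehicle model: if it holds, dividing a marginal effect in Table 8 by the matching coefficient in Table 5 should give roughly the same mean predicted frequency for every variable.

```python
import math


def average_marginal_effect(beta_k, betas, rows):
    """Average of beta_k * exp(x'beta) over observations (log-link count model)."""
    lambdas = [math.exp(sum(b * x for b, x in zip(betas, row))) for row in rows]
    return beta_k * sum(lambdas) / len(lambdas)


# Hypothetical example with two covariates: [intercept, ln(AADT in thousands)].
betas = [1.0, 0.7]  # illustrative values, not estimates from the paper
rows = [[1.0, math.log(10.0)], [1.0, math.log(20.0)], [1.0, math.log(30.0)]]
ame = average_marginal_effect(betas[1], betas, rows)

# Consistency check on the full motor vehicle model (Tables 5 and 8):
# marginal effect / coefficient = implied mean predicted crash frequency.
implied_mean = {
    "Ln(AADT-major)": 47.65 / 0.728,
    "HB_prop": -131.94 / -2.015,
    "Col_prop": -152.21 / -2.324,
}

for name, value in implied_mean.items():
    print(f"{name}: implied mean frequency = {value:.2f}")
```

The three implied means agree to within about 0.05 crashes, which is consistent with the assumed form, although indicator variables may have been treated differently.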
Comparing the marginal effects of microscopic variables (traffic volume and road variables) between the full NB models and the base NB models revealed some important differences. The major difference was in the marginal effect of the variable ‘presence of traffic signal.’ The base NB models showed that ‘presence of traffic signal’ had a positive association with crash frequency for all three transportation modes, whereas this variable was no longer statistically significant for motor vehicle and bicycle crashes in the full NB models. The presence of macroscopic variables also modified the marginal effects of other microscopic variables (see Table 8). For example, the marginal effects of ln(AADT-minor) were 6.38 and 0.14 for motor vehicle and bicycle crashes, respectively, in the base models, while these values were 7.67 for motor vehicle crashes and 0.17 for bicycle crashes in the models with macroscopic variables. This difference showed that, in the absence of important macroscopic variables, the marginal effects of ln(AADT-minor) for motor vehicle and bicycle crashes were biased downwards by 16.8% and 17.6%, respectively. These results agreed with the finding of Mitra and Washington (2012) that the exclusion of important variables may cause bias in coefficient estimates and incorrect inferences.
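The bias percentages can be retraced with a short calculation. The sketch assumes that the bias is measured relative to the full-model marginal effect; the function name is illustrative and the inputs are the ln(AADT-minor) values from Table 8.

```python
def downward_bias(base_effect: float, full_effect: float) -> float:
    """Share by which the base-model effect falls short of the full-model effect."""
    return (full_effect - base_effect) / full_effect


bias_motor = downward_bias(6.38, 7.67)
bias_bicycle = downward_bias(0.14, 0.17)

print(f"{bias_motor:.1%}, {bias_bicycle:.1%}")  # prints "16.8%, 17.6%"
```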
According to the parameter estimates (Tables 5-7) and their marginal effects (Table 8), the sets of significant variables were not consistent across transportation modes. ‘AADT on major approach’ and ‘density of total population’ were the two contributing factors that had statistically significant effects on all three response variables (i.e., motor vehicle, bicycle and pedestrian crash frequency). Four variables, ‘AADT on minor approach,’ ‘number of legs,’ ‘proportion of college productions and attractions’ and ‘proportion of workers commuting by public transportation,’ were significant for two response variables. Six variables were associated with only one response variable: ‘presence of traffic signal,’ ‘speed limit on major approach,’ ‘speed limit on minor approach,’ ‘proportion of home-based productions and attractions,’ ‘proportion of population between age 16 and 64’ and ‘proportion of workers commuting by walking.’ Detailed interpretations of these significant risk factors are offered in the following subsections.
Table 5 Parameter estimates for motor vehicle crash-frequency models

| Variables | Base NB: Mean | Base NB: Standard Error | Base NB: t-Statistic | Full NB: Mean | Full NB: Standard Error | Full NB: t-Statistic |
|---|---|---|---|---|---|---|
| Ln(AADT-major) | 0.728 | 0.038 | 19.13 | 0.728 | 0.042 | 17.30 |
| Ln(AADT-minor) | 0.097 | 0.041 | 2.35 | 0.117 | 0.043 | 2.74 |
| Leg-number | 0.451 | 0.066 | 6.83 | 0.414 | 0.068 | 6.11 |
| Traffic signal | 0.159 | 0.058 | 2.75 | | | |
| Speed-Minor | 0.031 | 0.004 | 8.30 | 0.028 | 0.006 | 4.49 |
| HB_prop | | | | -2.015 | 0.513 | -3.93 |
| Col_prop | | | | -2.324 | 0.673 | -2.64 |
| Pop_density | | | | 0.039 | 0.015 | 2.70 |
| Pub_prop | | | | -0.761 | 0.349 | -2.14 |
| Intercept | | | | 1.377 | 0.467 | 2.95 |

Note: all parameters are significant at the 0.1 level or better.

Table 6 Parameter estimates for bicycle crash-frequency models

| Variables | Base NB: Mean | Base NB: Standard Error | Base NB: t-Statistic | Full NB: Mean | Full NB: Standard Error | Full NB: t-Statistic |
|---|---|---|---|---|---|---|
| Ln(AADT-major) | 0.490 | 0.130 | 3.77 | 0.580 | 0.127 | 4.58 |
| Ln(AADT-minor) | 0.132 | 0.085 | 1.55 | 0.165 | 0.086 | 1.92 |
| Leg-number | 0.369 | 0.171 | 2.15 | 0.464 | 0.173 | 2.69 |
| Traffic signal | 0.312 | 0.135 | 2.32 | | | |
| Col_prop | | | | -2.791 | 1.669 | -1.67 |
| Pop_density | | | | 0.082 | 0.031 | 2.64 |
| Wal_prop | | | | 10.638 | 3.782 | 2.81 |
| Intercept | -2.304 | 0.412 | -5.59 | -3.283 | 0.447 | -7.35 |

Note: all parameters are significant at the 0.1 level or better.

Table 7 Parameter estimates for pedestrian crash-frequency models

| Variables | Base NB: Mean | Base NB: Standard Error | Base NB: t-Statistic | Full NB: Mean | Full NB: Standard Error | Full NB: t-Statistic |
|---|---|---|---|---|---|---|
| Ln(AADT-major) | 0.877 | 0.161 | 5.45 | 0.924 | 0.162 | 5.69 |
| Traffic signal | 0.732 | 0.154 | 4.75 | 0.522 | 0.156 | 3.35 |
| Speed-Major | -0.076 | 0.013 | -5.91 | -0.049 | 0.018 | -2.76 |
| Pop_density | | | | 0.090 | 0.031 | 2.88 |
| Age 16 to 64_prop | | | | -3.089 | 1.112 | -2.78 |
| Wal_prop | | | | 11.211 | 3.809 | 2.94 |

Note: all parameters are significant at the 0.1 level or better.

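Table 8 Estimate results for marginal effects of risk factors

| Variables | Motor vehicle: Base | Motor vehicle: Full | Bicycle: Base | Bicycle: Full | Pedestrian: Base | Pedestrian: Full |
|---|---|---|---|---|---|---|
| Ln(AADT-major) | 48.00 | 47.65 | 0.50 | 0.59 | 1.15 | 1.19 |
| Ln(AADT-minor) | 6.38 | 7.67 | 0.14 | 0.17 | | |
| Leg-number | 26.60 | 24.48 | 0.35 | 0.43 | | |
| Traffic signal | 10.47 | | 0.32 | | 0.96 | 0.67 |
| Speed-Major | | | | | -0.10 | -0.06 |
| Speed-Minor | 2.03 | 1.87 | | | | |
| HB_prop | | -131.94 | | | | |
| Col_prop | | -152.21 | | -2.85 | | |
| Pop_density | | 2.58 | | 0.08 | | 0.12 |
| Age 16 to 64_prop | | | | | | -3.96 |
| Pub_prop | | -49.86 | | 10.88 | | |
| Wal_prop | | | | | | 14.39 |

Note: all parameters are significant at the 0.1 level or better.
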
4.2.1 Traffic volume
As in numerous prior studies, traffic volumes were significant variables for intersection crashes and were positively correlated with crash occurrence (Lee and Abdel-Aty, 2005; Mitra and Washington, 2012; Xie et al., 2013). The marginal effects of ln(AADT-major) were 47.65, 0.59 and 1.19 for motor vehicle, bicycle and pedestrian crashes, respectively, in the full models, indicating that an increase of one thousand vehicles in major-approach AADT was associated, on average, with increases of 2.31, 0.03 and 0.06 in motor vehicle, bicycle and pedestrian crash frequency, respectively. Similarly, an increase of one thousand vehicles in minor-approach AADT was associated, on average, with increases of 0.69 and 0.02 in motor vehicle and bicycle crashes.
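The step from a marginal effect on ln(AADT) to an effect per thousand vehicles can be sketched as a division by the AADT level, since d(frequency)/d(AADT) = [d(frequency)/d ln(AADT)] / AADT. The mean AADT is not restated in this section, so the sketch backs out the level implied by the motor vehicle figures and applies it to the other two modes; treating that implied level as the evaluation point is an assumption, and the function name is illustrative.

```python
def per_unit_effect(log_marginal_effect: float, level: float) -> float:
    """Effect of a one-unit increase in x, given the marginal effect with respect to ln(x)."""
    return log_marginal_effect / level


# Major-approach AADT level (in thousands) implied by the motor vehicle figures:
# 47.65 per unit of ln(AADT-major) and 2.31 per additional thousand vehicles.
implied_major_aadt = 47.65 / 2.31

# Apply the same level to the bicycle and pedestrian marginal effects.
bicycle = per_unit_effect(0.59, implied_major_aadt)
pedestrian = per_unit_effect(1.19, implied_major_aadt)

print(f"implied AADT level: {implied_major_aadt:.1f} thousand")
print(f"bicycle: {bicycle:.2f}, pedestrian: {pedestrian:.2f}")
```

The implied level of roughly 20.6 thousand vehicles reproduces the reported 0.03 and 0.06 for bicycle and pedestrian crashes after rounding.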
4.2.2 Number of legs
The marginal effects of ‘number of legs’ on motor vehicle and bicycle crashes were 24.48 and 0.43 (Table 8). This result suggested that a four-legged intersection was associated with 24.48 more motor vehicle crashes and 0.43 more bicycle crashes than an intersection with three legs. This result was generally expected and agreed with the earlier finding that a larger number of legs may increase the likelihood of crash occurrence because of more potential conflicts (Zeng and Huang, 2014). However, this variable was not found to have a significant effect on pedestrian crash frequency. This may be because intersections with more legs tend to have higher design standards for facilities (such as marked crosswalks), which leads to mixed effects of ‘number of legs’ on pedestrian safety.
4.2.3 Traffic signal
The effects of traffic signals on safety merit particular attention. The parameter estimates in the base NB models showed that ‘presence of traffic signal’ had a positive association with all three target variables. This is not consistent with the empirical hypothesis that the installation of traffic lights improves intersection safety. In the base RPNB models, ‘presence of traffic signal’ had a positive effect on the three target variables, but with a magnitude that varied across intersections (the results for the RPNB models are not presented since there was no significant improvement in goodness-of-fit compared to the NB models). More interestingly, this variable was no longer statistically significant for either motor vehicle or bicycle crashes in the models integrating macroscopic variables. A possible reason for this difference is that the macroscopic variables account for the positive and heterogeneous effect of ‘presence of traffic signal.’ This implies that the installation of traffic lights does not itself increase crash risk, but that traffic lights tend to be installed at relatively hazardous sites; however, more work is needed to verify this conclusion.
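A random parameter of this kind is commonly read through the share of sites on each side of zero. The sketch below assumes a normally distributed signal coefficient, in line with the distribution named in the conclusions; the mean and standard deviation are hypothetical, since the RPNB estimates are not reported, and the function name is illustrative.

```python
from statistics import NormalDist


def share_positive(mean: float, sd: float) -> float:
    """Share of intersections whose random coefficient is above zero."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(0.0)


# Hypothetical mean and standard deviation of the signal coefficient;
# the paper does not report the RPNB estimates.
share = share_positive(mean=0.16, sd=0.10)

print(f"share of intersections with a positive signal effect: {share:.1%}")
```

With these illustrative values, most but not all intersections would show a positive association, which is what a coefficient that is positive on average but varies in magnitude across intersections looks like under a normal distribution.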
4.2.4 Speed limit
The variables related to speed limit had inconsistent effects on crash occurrence across modes. ‘Speed limit on minor approach’ was positively correlated with the frequency of motor vehicle crashes: an increase of 10 mph in the speed limit on the minor approach was associated with 18.7 more motor vehicle crashes. This is generally expected since at high speeds the time available to react to changes in the environment is shorter, leading to higher crash frequency. However, ‘speed limit on major approach’ was negatively associated with pedestrian crashes: the frequency of pedestrian crashes decreased by 0.6 for each 10 mph increase in the speed limit on the major approach. The probable reason for this result is that a higher speed limit is usually accompanied by higher design standards for facilities such as pedestrian overcrossings and underpasses. The effect of speed limit on bicycle crashes was not significant.
4.2.5 Trip characteristics
Results showed that the proportion of home-based productions and attractions near an intersection was negatively associated with motor vehicle crashes. The frequency of motor vehicle crashes decreased by 1.32 for each percentage-point increase in the proportion of home-based trips. This is reasonable since drivers on home-based trips are more familiar with the traffic environment and drive more cautiously (Abdel-Aty et al., 2011). ‘Proportion of college productions and attractions’ was negatively associated with motor vehicle and bicycle crashes but was not statistically significant for pedestrian crashes. A percentage-point increase in the proportion of college trips was associated with decreases of 1.52 and 0.03 in motor vehicle and bicycle crashes, respectively. The decrease in motor vehicle and bicycle crashes may be due to better traffic control measures in these areas. For pedestrians, however, a higher proportion of college productions and attractions is usually related both to better traffic control measures and to a high number of walking trips, which leads to mixed effects on pedestrian crashes.
4.2.6 Demographic characteristics
‘Density of total population’ was an influential macroscopic variable and was positively associated with all three target variables. The marginal effects of population density near an intersection showed that a unit increase in population density was associated with increases of 2.58, 0.08 and 0.12 in the frequency of motor vehicle, bicycle and pedestrian crashes, respectively (Table 8). This agrees with previous studies, as a larger population generally implies more crash exposure (Lee et al., 2015; Mitra and Washington, 2012; Pulugurtha and Sambhara, 2011). In addition, the proportion of population between age 16 and 64 near an intersection was found to have a negative effect on pedestrian crashes. The frequency of pedestrian crashes decreased by 0.04 for each percentage-point increase in the proportion of population aged 16-64. This may be because people in this age group walk less than younger and/or older people and are better able to avoid crash risk (Huang et al., 2010).
4.2.7 Commute behaviors
A percentage-point increase in the proportion of workers commuting by public transportation near an intersection was associated with a 0.50 decrease in motor vehicle crashes and a 0.11 increase in bicycle crashes (Table 8), indicating that public transportation commuting had opposite effects on motor vehicle and bicycle crashes. In addition, ‘proportion of workers commuting by walking’ had a significant and positive association with pedestrian crash occurrence. The number of pedestrian crashes increased by 0.14 for each percentage-point increase in the proportion of workers commuting by walking. This result is not surprising since more walking means more exposure to pedestrian crashes.
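The per-10-mph and per-percentage-point figures quoted in Sections 4.2.4-4.2.7 are rescaled versions of the unit marginal effects in Table 8. The sketch below retraces a few of them, assuming that the proportion variables are measured on a 0-1 scale so that one percentage point equals 0.01; the dictionary layout and step names are illustrative.

```python
# Full-model marginal effects copied from Table 8 (per unit change in each variable).
marginal_effects = {
    ("Speed-Minor", "motor vehicle"): 1.87,
    ("Speed-Major", "pedestrian"): -0.06,
    ("HB_prop", "motor vehicle"): -131.94,
    ("Col_prop", "motor vehicle"): -152.21,
    ("Age 16 to 64_prop", "pedestrian"): -3.96,
    ("Wal_prop", "pedestrian"): 14.39,
}

SPEED_STEP = 10.0   # a 10 mph change in speed limit
SHARE_STEP = 0.01   # one percentage point, assuming proportions on a 0-1 scale

rescaled = {}
for (variable, mode), effect in marginal_effects.items():
    step = SPEED_STEP if variable.startswith("Speed") else SHARE_STEP
    rescaled[(variable, mode)] = effect * step

for (variable, mode), value in rescaled.items():
    print(f"{variable} ({mode}): {value:+.2f} crashes per step")
```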
5. Conclusions and recommendations
This paper sought to examine the effects of omitted macroscopic factors in crash-frequency models by transportation mode at intersections. For this purpose, separate intersection-level crash-frequency models for the motor vehicle, bicycle and pedestrian modes were developed. Road characteristics related to geometric design and regulatory/control attributes, traffic characteristics of the intersections, and macroscopic factors at the TAZ level surrounding the intersection, including trip production/attraction, demographic and socio-economic characteristics and commute behaviors, were used as explanatory variables. Data extracted for 279 intersections located in Hillsborough County, Florida, USA, were used for model development.
The empirical analysis revealed a number of interesting findings. First, omission of macroscopic variables would result in biased estimates for the retained microscopic variables. The marginal effects of traffic volumes and road features differed considerably between the base model and the full model with macroscopic variables. For example, the safety effects of minor-approach AADT on motor vehicle and bicycle crashes were biased downwards by 16.8% and 17.6%, respectively, in the absence of macroscopic variables.
Second, macroscopic variables helped to track the heterogeneity of certain risk factors. The base RPNB models showed that the safety effect of ‘presence of traffic signal’ was best fit with a normally distributed random parameter, suggesting that ‘presence of traffic signal’ had heterogeneous effects across intersections, whereas this variable was no longer statistically significant in the models with macroscopic variables. This implied that the heterogeneous effects of ‘presence of traffic signal’ on motor vehicle crashes could be mostly captured by these macroscopic variables, at least for the Hillsborough dataset.
Third, model comparison using the log likelihood at convergence suggested that considering macroscopic variables was vital in improving model performance. In addition, PRV values were calculated to assess the explanatory power of the macroscopic variables. The PRV values in the bicycle and pedestrian crash-frequency models were much higher than those in the motor vehicle models, indicating that integrating macroscopic factors played a more important role in developing non-motorized crash-frequency models than in developing motor vehicle models.
Fourth, comparing the models developed with 0.25 mile, 0.5 mile and 1 mile buffer width data, the models with macroscopic variables at the 0.25 mile buffer width had the lowest AIC and highest PRV; conversely, the models with macroscopic variables at the 1 mile buffer width had the highest AIC and lowest PRV in all three types of crash-frequency models. Thus, a relatively small buffer width for extracting macroscopic factors around the intersection would provide a better estimate.
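The buffer-width comparison can be retraced directly from the AIC values of the NB models in Tables 2-4. The following is a minimal sketch; the dictionary layout and variable names are illustrative.

```python
# AIC of the NB models with macroscopic variables, copied from Tables 2-4.
aic_by_buffer = {
    "motor vehicle": {"0.25 mile": 2558.50, "0.50 mile": 2561.73, "1 mile": 2572.56},
    "bicycle": {"0.25 mile": 716.85, "0.50 mile": 718.28, "1 mile": 719.24},
    "pedestrian": {"0.25 mile": 786.68, "0.50 mile": 788.54, "1 mile": 791.05},
}

# For each mode, pick the buffer width with the lowest AIC and the one with the highest.
best = {mode: min(aics, key=aics.get) for mode, aics in aic_by_buffer.items()}
worst = {mode: max(aics, key=aics.get) for mode, aics in aic_by_buffer.items()}

for mode in aic_by_buffer:
    print(f"{mode}: lowest AIC at {best[mode]}, highest AIC at {worst[mode]}")
```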
Finally, macroscopic factors of the zone surrounding an intersection, such as ‘proportion of home-based productions and attractions’, ‘proportion of college productions and attractions’, ‘density of total population’, ‘proportion of population between age 16 and 64’, ‘proportion of workers commuting by public transportation’ and ‘proportion of workers commuting by walking’, were demonstrated to have significant effects on intersection crashes, whereas these variables are typically ignored in traditional micro-level (e.g., intersection and segment) crash-frequency models. This indicated that not only traffic volumes and road features but also macroscopic factors should be considered in estimating crash risk and identifying crash-prone locations.
The topic of integrating macroscopic factors in intersection/segment-level crash-frequency models is emerging. This line of research has great potential for proactively predicting crash risk and identifying suitable countermeasures to reduce crashes at “new” intersections/segments as well as at intersections/segments near “new” development. Nevertheless, several limitations of this study should be noted. First, it is worthwhile to apply the models to other intersections and regions in order to investigate the spatial transferability of the calibrated models. In addition, there may be correlations among the crash frequencies of different transportation modes within an intersection, caused by unobserved influential factors. Therefore, a simultaneous model accounting for correlations of crashes among transportation modes, such as a multivariate model, will be explored further to investigate this issue.
Acknowledgements
This work was jointly supported by: 1) Natural Science Foundation of China (No. 71371192, No. 71561167001); 2) the Research Fund for Fok Ying Tong Education Foundation of Hong Kong (142005); and 3) Fundamental Research Funds for the Central Universities of CSU (No. 2016zzts050). We would like to thank Dr. Mohamed Abdel-Aty at the University of Central Florida and the Florida Department of Transportation for providing the data.
References
Abdel-Aty, M., Siddiqui, C., Huang, H., Wang, X., 2011. Integrating trip and roadway characteristics to manage safety in traffic analysis zones. Transportation Research Record: Journal of the Transportation Research Board 2213, 20-28.
Anastasopoulos, P.C., Mannering, F.L., 2009. A note on modeling vehicle accident frequencies with random-parameters count models. Accident Analysis and Prevention 41, 153-159.
Christoffel, T., Gallagher, S., 1999. Injury Prevention and Public Health: Practical Knowledge, Skills, and Strategies. Aspen Publishers Inc., Gaithersburg, MD.
Dong, N., Huang, H., Zheng, L., 2015. Support vector machine in crash prediction at the level of traffic analysis zones: Assessing the spatial proximity effects. Accident Analysis and Prevention 82, 192-198.
Dong, N., Huang, H., Lee, J., Gao, M., Abdel-Aty, M., 2016. Macroscopic hotspots identification: A Bayesian spatio-temporal interaction approach. Accident Analysis and Prevention 92, 256-264.
Huang, H., Abdel-Aty, M., Darwiche, A., 2010. County-level crash risk analysis in Florida: Bayesian spatial modeling. Transportation Research Record: Journal of the Transportation Research Board 2148, 27-37.
Huang, H., Song, B., Xu, P., Zeng, Q., Lee, J., Abdel-Aty, M., 2016. Macro and micro models for zonal crash prediction with application in hot zones identification. Journal of Transport Geography 54, 248-256.
Jacobsen, P.L., 2003. Safety in numbers: More walkers and bicyclists, safer walking and bicycling. Injury Prevention 9, 205-209.
Lee, C., Abdel-Aty, M., 2005. Comprehensive analysis of vehicle-pedestrian crashes at intersections in Florida. Accident Analysis and Prevention 37, 775-786.
Lee, J., Abdel-Aty, M., Jiang, X., 2015. Multivariate crash modeling for motor vehicle and non-motorized modes at the macroscopic level. Accident Analysis and Prevention 78, 146-154.
Lord, D., Mannering, F., 2010. The statistical analysis of crash-frequency data: A review and assessment of methodological alternatives. Transportation Research Part A 44, 291-305.
Mitra, S., Washington, S., 2012. On the significance of omitted variables in intersection crash modeling. Accident Analysis and Prevention 49, 439-448.
Miranda-Moreno, L.F., Morency, P., El-Geneidy, A.M., 2011. The link between built environment, pedestrian activity and pedestrian-vehicle collision occurrence at signalized intersections. Accident Analysis and Prevention 43, 1624-1634.
Mannering, F.L., Bhat, C.R., 2014. Analytic methods in accident research: methodological frontier and future directions. Analytic Methods in Accident Research 1, 1-22.
Mannering, F.L., Shankar, V., Bhat, C.R., 2016. Unobserved heterogeneity and the statistical analysis of highway accident data. Analytic Methods in Accident Research 11, 1-16.
NHTSA, National Highway Traffic Safety Administration, FARS Data, 2013. www-fars.nhtsa.dot.gov/Main/index.aspx
Pulugurtha, S.S., Sambhara, V.R., 2011. Pedestrian crash estimation models for signalized intersections. Accident Analysis and Prevention 43, 439-446.
Quddus, M., 2008. Modeling area-wide count outcomes with spatial correlation and heterogeneity: an analysis of London crash data. Accident Analysis and Prevention 40, 1486-1497.
Raudenbush, S.W., Bryk, A.S., 2002. Hierarchical linear models: Applications and data analysis methods, 2nd ed. Sage, Thousand Oaks.
Ryb, G.E., Dischinger, P.C., Kufera, J.A., Soderstrom, C.A., 2007. Social, behavioral and driving characteristics of injured pedestrians: A comparison with other unintentional trauma patients. Accident Analysis and Prevention 39, 313-318.
Strauss, J., Miranda-Moreno, L.F., Morency, P., 2013. Cyclist activity and injury risk analysis at signalized intersections: a Bayesian modeling approach. Accident Analysis and Prevention 59, 9-17.
Strauss, J., Miranda-Moreno, L.F., Morency, P., 2014. Multimodal injury risk analysis of road users at signalized and non-signalized intersections. Accident Analysis and Prevention 71, 201-209.
Ukkusuri, S., Miranda-Moreno, L.F., Ramadurai, G., Isa-Tavarez, J., 2012. The role of built environment on pedestrian crash frequency. Safety Science 50, 1141-1151.
Venkataraman, N., Ulfarsson, G.F., Shankar, V.N., 2013. Random parameter models of interstate crash frequencies by severity, number of vehicles involved, collision and location type. Accident Analysis and Prevention 59, 309-318.
Wang, J., Huang, H., 2016. Road network safety evaluation using Bayesian hierarchical joint model. Accident Analysis and Prevention 90, 152-158.
Xie, K., Wang, X., Huang, H., Chen, X., 2013. Corridor-level signalized intersection safety analysis in Shanghai, China using Bayesian hierarchical models. Accident Analysis and Prevention 50, 25-33.
Xu, P., Huang, H., Dong, N., Abdel-Aty, M., 2014. Sensitivity analysis in the context of regional safety modeling: Identifying and assessing the modifiable areal unit problem. Accident Analysis and Prevention 70, 110-120.
Xu, P., Huang, H., 2015. Modeling crash spatial heterogeneity: Random parameter versus geographically weighting. Accident Analysis and Prevention 75, 16-25.
Xu, P., Huang, H., Dong, N., Wong, S., 2016. Revisiting crash spatial heterogeneity: a Bayesian spatially varying coefficients approach. Accident Analysis and Prevention, DOI: 10.1016/j.aap.2016.10.015.
Zeng, Q., Huang, H., 2014. Bayesian spatial joint modeling of traffic crashes on an urban road network. Accident Analysis and Prevention 67, 105-112.
... Congestion delays have a significant impact on air quality, resulting in higher health risks for on-road users (Zhang and Batterman, 2013). Evidence has demonstrated that crashes between motorized vehicles and active transportation have a higher probability in zones with more transit stops; wider road width; dense and diverse land uses; and elevated traffic volumes (Amoh-Gyimah et al., 2016;Ukkusuri et al., 2012;Wang et al., 2017). ...
Article
Full-text available
Road space distribution has traditionally been based on the hierarchical classification of streets. In arterials, the majority of space is dedicated to traffic lanes, whereas local streets typically have fewer traffic lanes and more space for parking or sidewalks. Within urban areas, road space is contested between two main types of spaces: corridors of movement, and places for access and standing/stillness/staying. Given the limited availability of urban space, particularly in central areas, deciding how to allocate space for these functions poses a dilemma and requires tradeoffs. Nonetheless, certain areas experience underutilization and inefficiencies in space utilization over time. In this context, we propose a site selection methodology to identify complex zones within a city where different types of users and demands compete for space. These zones present the potential for dynamically allocating road space based on fluctuating demands and policy objectives. This methodology serves as an initial guide for planners to identify zones that require a thorough evaluation of activities and diverse temporal-spatial demands when reallocating road space. We use network centrality, land use indicators, traffic, and public transport dynamics indicators to detect complex zones and apply them to a Lisbon case study.
... Other studies identified road design and infrastructure as critical factors in cycling crashes, including the presence of dedicated cycling lanes and intersections [25][26][27]. Traffic calming measures, and adequate lighting can significantly reduce the risk of cycling crashes [28][29][30]. On the other hand, poor road conditions such as potholes, uneven surfaces, and inadequate signage can increase the risk of cycling crashes [31][32][33]. ...
Preprint
Full-text available
This manuscript presents a study on the spatial relationships between bike accidents, the built environment, land use, and transportation network characteristics in Budapest, Hungary using Geographic Weighted Regression (GWR). The sample period included bike crash data between 2017 and 2022. The findings provide insights into the spatial distribution of bike crashes and their severity, which can be useful for designing targeted interventions to improve bike safety in Budapest and be useful for policymakers and city planners in developing effective strategies to reduce the severity of bike crashes in urban areas. The study reveals that the built environment features, such as traffic signals, road crossings, and bus stops, are positively correlated with the bike crashes index, particularly in the inner areas of the city. However, traffic signals have a negative correlation with the bike crash index in the suburbs, where they may contribute to making roads safer for cyclists. The study also shows that commercial activity and PT stops have a higher impact on bike crashes in the northern and western districts. The GWR analysis further suggests that one-way roads and higher speed limits are associated with more severe bike crashes, while green and recreational areas are generally safer for cyclists. Future research should be focused on the traffic volume and bikes trips’ effects on the severity index.
... Also, fatalities for crashes on interstate and other freeways were found to be more likely for pedestrians walking under-the-influence (Wang and Cicchino 2020). A study done to observe the effect of zonal factors in estimating crash risk revealed a biased parameter estimation and incorrect inferences without macroscopic variables and concluded that macroscopic variables with a significant role in elevating model performance for non-motorized modes, i.e., active modes, compared to motorized modes (Wang et al. 2017). Thus, our study intends for microscopic analysis of active mode crashes in Utah, including macroscopic factors from the Census block group using data from the US Census and the Smart Location Database (SLD). ...
Conference Paper
This exploratory study’s goal was to identify factors associated with suspected alcohol/drug impairment among active transportation mode users involved in crashes. Crash and other spatial data were assembled and joined for over 7,000 bicycle and 9,000 pedestrian crashes with motor vehicles that occurred in Utah over a 12-year period (2010–2021). Overall, 56 bicycle crashes (0.8%) and 220 pedestrian crashes (2.4%) involving an impaired non-motorized were identified from crash data. To identify associated factors, bivariate and multivariate analyses were conducted. For both bicycle and pedestrian crashes, impairment was more likely: for older adults, on weekends, overnight, and in places with lower intersection density. For bicycle crashes, impairment was more likely in areas with smaller household sizes and with more liquor stores. For pedestrian crashes, impairment was more likely in the dark and in rural areas. While temporal factors are consistent with expectations about alcohol consumption on nights and weekends, locational findings indicate greater suspicion of impairment in places where walking/bicycling are less expected (rural areas and auto-oriented environments).
... This allows for negative binomial regression to be performed even if the dependent variable's mean and variance are not equal [16]. Negative binomial regression is commonly used for traffic safety applications because it has loosened restrictions in comparison to Poisson regression but is still capable of estimating an observed count, such as crash counts [17]. The negative binomial distribution takes the form presented in Equation (4), in which α is the reciprocal of the scale parameter of the gamma noise variable and other variables are as defined previously. ...
Article
Full-text available
Using surrogate safety measures is a common method to assess safety on roadways. Surrogate safety measures allow for proactive safety analysis; the analysis is performed prior to crashes occurring. This allows for safety improvements to be implemented proactively to prevent crashes and the associated injuries and property damage. Existing surrogate safety measures primarily rely on data generated by microsimulations, but the advent of connected vehicles has allowed for the incorporation of data from actual cars into safety analysis with surrogate safety measures. In this study, commercially available connected vehicle data are used to develop crash prediction models for crashes at intersections and segments in Salt Lake City, Utah. Harsh braking events are identified and counted within the influence areas of sixty study intersections and thirty segments and then used to develop crash prediction models. Other intersection characteristics are considered as regressor variables in the models, such as intersection geometric characteristics, connected vehicle volumes, and the presence of schools and bus stops in the vicinity. Statistically significant models are developed, and these models may be used as a surrogate safety measure to analyze intersection safety proactively. The findings are applicable to Salt Lake City, but similar research methods may be employed by researchers to determine whether these models are applicable in other cities and to determine how the effectiveness of this method endures through time.
Thesis
Recently, in many cities worldwide, transportation planning has focused on reallocating road space from automobile to more sustainable transport modes. Mostly in urban areas, road space (from façade to façade) is highly disputed by different urban activities, transport modes, and functions. Road space reallocation has often lacked consideration of demand fluctuations during hours, days, and seasons. Consequently, there are periods when certain spaces are oversupplied, while others are undersupplied. The hypothesis of this work is that there is a potential to allocate road space dynamically over time when demands are complementary or even disputed. Potentially, big data and emerging sensing technologies can be useful for short-term decision making, since these technologies can characterize different demands in real-time. Transport demand management and control technologies may be used for allocating space dynamically, informing the transitions to users, and connecting infrastructure to users (if/when needed). The main objective of the thesis is to explore where, when and how road space can be allocated dynamically over time. The thesis explores the applicability of allocating road space dynamically in areas of cities that have very disputed, but limited space to fulfill all demands. We propose a site selection methodology for choosing zones that are complex to reallocate road space and discuss the main local criteria necessary for different solutions of dynamic road space allocation. Also, the levels of technological adoption and requirements, along with the main challenges and opportunities for dynamic design are discussed in various technological contexts, ranging from no use of technology to a context where all transport modes and infrastructure are connected. 
Additionally, we discuss the risks and applicability of using artificial intelligence to improve public participation of both stakeholders and decision makers from interdisciplinary fields in road space allocation projects. In sum, we conclude that dynamic road space allocation is context oriented, where different solutions differ in terms of implementation complexity, technological requirements, and social acceptance. It is essential to have the right balance between the frequency of dynamic changes, the use of technology, and the number of proposed solutions, maintaining most of the street's characteristics in a logical orientation.
Article
This manuscript presents a study on the spatial relationships between bike accidents, the built environment, land use, and transportation network characteristics in Budapest, Hungary using geographically weighted regression (GWR). The sample period includes bike crash data between 2017 and 2022. The findings provide insights into the spatial distribution of bike crashes and their severity, which can help policymakers and city planners design targeted interventions to improve bike safety in Budapest and develop effective strategies to reduce the severity of bike crashes in urban areas. The study reveals that built environment features, such as traffic signals, road crossings, and bus stops, are positively correlated with the bike crash index, particularly in the inner areas of the city. However, traffic signals have a negative correlation with the bike crash index in the suburbs, where they may contribute to making roads safer for cyclists. The study also shows that commercial activity and PT stops have a higher impact on bike crashes in the northern and western districts. GWR analysis further suggests that one-way roads and higher speed limits are associated with more severe bike crashes, while green and recreational areas are generally safer for cyclists. Future research should focus on the effects of traffic volume and bike trips on the severity index.
Chapter
This study focuses on the analysis and summary of HUD-related studies for the period 2017–2022. Based on the bibliometric approach, the 1009 papers obtained from the WOS (Web of Science) Core library search that were most relevant to the research topic were highlighted for analysis. Based on the traditional bibliometric analysis methods such as keyword frequency analysis and cluster analysis, it was found that the main research hotspots of automotive HUDs in the past five years include human-automation interaction, advanced driver assistance system, crossing behavior and so on. The analysis found that HUDs are closely integrated with assisted driving and autonomous driving scenarios. The analysis is based on a hierarchical analysis of human, automotive and HUD, which reveals that HUDs are applied to a number of scenarios such as early warning, navigation and infotainment in the car, and the driver's attention is allocated to a number of factors. This paper will help future researchers to understand the latest developments in automotive HUDs from a macro perspective, identify key areas for HUD research and application, and provide some insight and useful guidance for future research on the selection of HUD applications in driving scenarios. Keywords: Design for social change in global markets; Bibliometric Study; Head-up Display; HUD; Literature Review; Automotive
Article
The realization of the many benefits of bicycling will not be achieved in American regions until safer bike infrastructure and bicycling conditions are presented to a more general population. The Phoenix region—one of the nation’s most populous—has sought policies and programs to increase bicycling rates. Yet, the region continues to have a small mode share, underscoring a need to motivate population-level bicycling adoption. This study examines 2015–2019 bicycle-vehicle crash data to identify those macro-level factors associated with bicycle-vehicle crashes and a subset of crashes where a serious injury or fatality occurred. Specifically, the effects of a robust set of socioeconomic and built environment factors, measured at three hexagon spatial extents, in negative binomial and spatial Durbin models were estimated for the two crash outcomes. Results show denser zones with a traditional network design experienced more bicyclist-involved crashes, as did zones with a higher percentage of low-income households and working-age adults. Findings, which also found spatial clustering of total and severe bicyclist-involved crashes, suggest that the targeted provision of safer bike infrastructure and a more complete network in zones exhibiting certain macro-level attributes holds promise in creating bike-friendly conditions that generate more utilitarian and recreational bicycling throughout the region.
Technical Report
There are great disparities in the quantity and quality of transport infrastructure. Differences in access to investment are often exacerbated by weak governance and an inadequate regulatory framework with poor enforcement which lead to high costs and defective construction. The wellbeing of many poor people is constrained by lack of transport, which is called ‘transport poverty’. This evidence and gap map identifies, maps and describes existing evidence on the effects of transport sector interventions related to all means of transport (roads, paths, cycle lanes, bridges, railways, ports, shipping, and inland waterways, and air transport). Suggested citation: Malhotra, S, White, H, de la Cruz, N, Saran, A, Eyers, J, John, D, Beveridge, E, Blondal, N 2021, Evidence and gap map-studies of the effectiveness of transport sector intervention in low and middle – income countries CEDIL/Campbell Gap Map 2021. Available at https://doi.org/10.51744/CSWP3
Article
This study proposes a Bayesian spatio-temporal interaction approach for hotspot identification by applying the full Bayesian (FB) technique in the context of macroscopic safety analysis. Compared with the emerging Bayesian spatial and temporal approach, the Bayesian spatio-temporal interaction model contributes to a detailed understanding of differential trends through analyzing and mapping probabilities of area-specific crash trends as differing from the mean trend and highlights specific locations where crash occurrence is deteriorating or improving over time. With traffic analysis zones (TAZs) crash data collected in Florida, an empirical analysis was conducted to evaluate the following three approaches for hotspot identification: FB ranking using a Poisson-lognormal (PLN) model, FB ranking using a Bayesian spatial and temporal (B-ST) model and FB ranking using a Bayesian spatio-temporal interaction (B-ST-I) model. The results show that (a) the models accounting for space-time effects perform better in safety ranking than does the PLN model, and (b) the FB approach using the B-ST-I model significantly outperforms the B-ST approach in correctly identifying hotspots by explicitly accounting for the space-time variation in addition to the stable spatial/temporal patterns of crash occurrence. In practice, the B-ST-I approach plays key roles in addressing two issues: (a) how the identified hotspots have evolved over time and (b) the identification of areas that, whilst not yet hotspots, show a tendency to become hotspots. Finally, it can provide guidance to policy decision makers to efficiently improve zonal-level safety.
Article
This study was performed to investigate the spatially varying relationships between crash frequency and related risk factors. A Bayesian spatially varying coefficients model was elaborately introduced as a methodological alternative to simultaneously account for the unstructured and spatially structured heterogeneity of the regression coefficients in predicting crash frequencies. The proposed method was appealing in that the parameters were modeled via a conditional autoregressive prior distribution, which involved a single set of random effects and a spatial correlation parameter with extreme values corresponding to pure unstructured or pure spatially correlated random effects.
Article
Zonal crash prediction has been one of the most prevalent topics in recent traffic safety research. Typically, zonal safety level is evaluated by relating aggregated crash statistics at a certain spatial scale to various macroscopic factors. Another potential solution is from the micro-level perspective, in which zonal crash frequency is estimated by summing up the expected crashes of all the road entities located within the zones of interest. This study intended to compare these two types of zonal crash prediction models. The macro-level Bayesian spatial model with conditional autoregressive prior and the micro-level Bayesian spatial joint model were developed and empirically evaluated, respectively. An integrated hot zone identification approach was then proposed to exploit the merits of separate macro and micro screening results. The research was based on a three-year dataset of an urban road network in Hillsborough County, Florida, U.S. Results revealed that the micro-level model has better overall fit and predictive performance, provides better insights about the micro factors that closely contribute to crash occurrence, and leads to more direct countermeasures. The macro-level crash analysis, by contrast, has the advantages of requiring less detailed data, providing additional guidance on non-traffic-engineering issues, and serving as an indispensable tool for incorporating safety considerations into long-term transportation planning. Based on the proposed integrated screening approach, specific treatment strategies could be proposed for different screening categories. The present study is expected to provide an explicit template for applying either technique appropriately.
Article
Objective: To examine the relationship between the numbers of people walking or bicycling and the frequency of collisions between motorists and walkers or bicyclists. The common wisdom holds that the number of collisions varies directly with the amount of walking and bicycling. However, three published analyses of collision rates at specific intersections found a non-linear relationship, such that collision rates declined with increases in the numbers of people walking or bicycling. Data: This paper uses five additional data sets (three population level and two time series) to compare the amount of walking or bicycling and the injuries incurred in collisions with motor vehicles. Results: The likelihood that a given person walking or bicycling will be struck by a motorist varies inversely with the amount of walking or bicycling. This pattern is consistent across communities of varying size, from specific intersections to cities and countries, and across time periods. Discussion: This result is unexpected. Since it is unlikely that the people walking and bicycling become more cautious if their numbers are larger, it indicates that the behavior of motorists controls the likelihood of collisions with people walking and bicycling. It appears that motorists adjust their behavior in the presence of people walking and bicycling. There is an urgent need for further exploration of the human factors controlling motorist behavior in the presence of people walking and bicycling. Conclusion: A motorist is less likely to collide with a person walking and bicycling if more people walk or bicycle. Policies that increase the numbers of people walking and bicycling appear to be an effective route to improving the safety of people walking and bicycling.
Article
Highway accidents are complex events that involve a variety of human responses to external stimuli, as well as complex interactions between the vehicle, roadway features/condition, traffic-related factors, and environmental conditions. In addition, there are complexities involved in energy dissipation (once an accident has occurred) that relate to vehicle design, impact angles, the physiological characteristics of involved humans, and other factors. With such a complex process, it is impossible to have access to all of the data that could potentially determine the likelihood of a highway accident or its resulting injury severity. The absence of such important data can potentially present serious specification problems for traditional statistical analyses that can lead to biased and inconsistent parameter estimates, erroneous inferences and erroneous accident predictions. This paper presents a detailed discussion of this problem (typically referred to as unobserved heterogeneity) in the context of accident data and analysis. Various statistical approaches available to address this unobserved heterogeneity are presented along with their strengths and weaknesses. The paper concludes with a summary of the fundamental issues and directions for future methodological work that addresses unobserved heterogeneity.
Article
Safety and efficiency are commonly regarded as two significant performance indicators of transportation systems. In practice, road network planning has focused on road capacity and transport efficiency whereas the safety level of a road network has received little attention in the planning stage. This study develops a Bayesian hierarchical joint model for road network safety evaluation to help planners take traffic safety into account when planning a road network. The proposed model establishes relationships between road network risk and micro-level variables related to road entities and traffic volume, as well as socioeconomic, trip generation and network density variables at macro level which are generally used for long term transportation plans. In addition, network spatial correlation between intersections and their connected road segments is also considered in the model. A road network is elaborately selected in order to compare the proposed hierarchical joint model with a previous joint model and a negative binomial model. According to the results of the model comparison, the hierarchical joint model outperforms the joint model and negative binomial model in terms of the goodness-of-fit and predictive performance, which indicates the reasonableness of considering the hierarchical data structure in crash prediction and analysis. Moreover, both random effects at the TAZ level and the spatial correlation between intersections and their adjacent segments are found to be significant, supporting the employment of the hierarchical joint model as an alternative in road-network-level safety modeling as well.
Article
Injury Prevention and Public Health: Practical Knowledge, Skills, and Strategies, Second Edition presents the complex nature of injuries and violence but provides this information in a highly comprehensible manner. The authors' devotion to advocacy for the prevention of injuries, both unintentional and intentional, makes this title an essential read for both public health students and public health professionals.