
Integrating macro- and micro-level safety analyses: a Bayesian approach incorporating spatial interaction

Transportmetrica A: Transport Science
ISSN: 2324-9935 (Print) 2324-9943 (Online) Journal homepage: http://www.tandfonline.com/loi/ttra21
Qing Cai, Mohamed Abdel-Aty, Jaeyoung Lee & Helai Huang
To cite this article: Qing Cai, Mohamed Abdel-Aty, Jaeyoung Lee & Helai Huang (2018):
Integrating macro- and micro-level safety analyses: a Bayesian approach incorporating spatial
interaction, Transportmetrica A: Transport Science, DOI: 10.1080/23249935.2018.1471752
To link to this article: https://doi.org/10.1080/23249935.2018.1471752
Accepted author version posted online: 30 Apr 2018.
Published online: 10 May 2018.
Qing Cai^a, Mohamed Abdel-Aty^a, Jaeyoung Lee^a and Helai Huang^b
^a Department of Civil, Environment and Construction Engineering, University of Central Florida, Orlando, FL, USA; ^b Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, People’s Republic of China
ABSTRACT
Traditionally, crash frequency analyses have been undertaken at the macro- and micro-levels, independently. This study proposes a Bayesian integrated spatial crash frequency model, which links the crash counts of macro- and micro-levels based on the spatial interaction. In addition, the proposed model considers the spatial autocorrelation of the different types of road entities (i.e. segments and intersections) at the micro-level with a joint structure. The modelling results indicated that the integrated model can provide better model performance for estimating macro- and micro-level crash counts, which validates the concept of integrating the models for the two levels. Also, the integrated model could simultaneously identify both macro- and micro-level factors contributing to the crash occurrence. Subsequently, a novel hotspot identification method was suggested, which enables us to detect hotspots for both macro- and micro-levels with comprehensive information from the two levels.
ARTICLE HISTORY
Received 7 October 2017; Accepted 29 April 2018
KEYWORDS
Integrated model; macro- and micro-level crash frequency; spatial interaction; hotspot identification; Bayesian modeling
1. Introduction
In the last few decades, there has been a growing recognition of the importance of safety in transportation research. Initially, the Transportation Equity Act for the 21st Century (Houston 1998) suggested considering safety in the transportation planning process. Later, Washington et al. (2006) discussed how to incorporate safety into transportation planning at different levels. Currently, the Moving Ahead for Progress in the 21st Century Act (MAP-21 Act) (US Congress 2012) and the Fixing America’s Surface Transportation Act (FAST Act) (U.S. DOT 2015) require the incorporation of transportation safety in the long-term transportation planning process.
One of the most widely used approaches to investigate traffic safety is crash frequency modeling, which can quantify exogenous factors contributing to the number of traffic crashes. Traditionally, crash frequency analyses have been adopted for both macro- and micro-levels. However, previous studies have explored traffic safety at either the macro- or the micro-level; to the best of our knowledge, no study has integrated the two levels. If traffic safety research is conducted for the same study area, macro- and micro-level crash analyses investigate the same crashes at different aggregation levels. Hence, we can assume that the crash counts at the two levels are correlated. In particular, the total number of crashes in each zone (macro-level) is supposed to equal the total number of crashes on all road entities, including segments and intersections (micro-level), located in that zone. Therefore, an integrated crash frequency analysis might improve model performance and help in better understanding crash mechanisms. As a result, more effective and efficient countermeasures can be provided at both the macro- and micro-levels to enhance transportation safety.
CONTACT Qing Cai, qingcai@knights.ucf.edu, Department of Civil, Environment and Construction Engineering, University of Central Florida, Orlando, FL, USA
© 2018 Hong Kong Society for Transportation Studies Limited
This study aims to propose an integrated model to deal with the following issues: (1) to
investigate transportation safety problems at the macro- and micro-levels, simultaneously;
(2) to handle the potential correlation of crash counts between the macro- and micro-levels
based on the spatial interactions between the two different aggregation levels; (3) to con-
sider the spatial autocorrelation of the road entities (i.e. segments and intersections) by
employing a joint model structure at the micro-level.
2. Literature review
A wide array of research efforts has examined traffic crash frequency (see Lord and Mannering 2010 for a detailed review). These studies have been conducted at different scales: macro-level (e.g. traffic analysis zone (TAZ), county) and micro-level (e.g. intersection, segment). Frequency models for crash analysis include Poisson, negative binomial, Poisson-lognormal, and other count models. To deal with spatial–temporal correlation, excess zeros, and multilevel heterogeneity among observations, model structures with finite mixture/latent class (Yasmin and Eluru 2016), zero inflation (Cai et al. 2016; Dong et al. 2014), random effects (Lee et al. 2015; Lee, Abdel-Aty, and Jiang 2015), and random parameters (Anastasopoulos and Mannering 2009) have been investigated.
Different characteristics aggregated at the macro-level have been examined, including traffic patterns such as vehicle-miles travelled (VMT) (Abdel-Aty et al. 2013) and proportion of heavy VMT (Cai et al. 2016); road characteristics such as road length with different functional classification (Quddus 2008), road length with different speed limits (Abdel-Aty et al. 2013; Siddiqui and Abdel-Aty 2012), and intersection density (Cai et al. 2016; Huang, Abdel-Aty, and Darwiche 2010; Wang et al. 2018); and socioeconomic factors such as population density (Huang, Abdel-Aty, and Darwiche 2010), commuting modes (Cai et al. 2016), household income (Xu and Huang 2015), and employment (Quddus 2008).
On the other hand, various factors have been considered in micro-level crash analysis, including traffic characteristics such as annual average daily traffic (AADT) (Abdel-Aty and Radwan 2000), average speed (Huang et al., ‘Predicting Crash Frequency,’ 2016; Taylor, Baruya, and Kennedy 2002; Wang et al. 2015; Wang, Abdel-Aty, and Lee 2017), and speed variance (Garber and Ehrhart 2000); geometric features such as median width (Yu and Abdel-Aty 2013), degree of curvature (Shankar, Mannering, and Barfield 1995), number of lanes (Wang and Abdel-Aty 2008; Ye et al. 2009), and pavement conditions (Anastasopoulos and Mannering 2009; Lee, Nam, and Abdel-Aty 2015); control type such as signal phasing scheme (Wong, Sze, and Li 2007); and environment such as weather condition (Yu and Abdel-Aty 2013). Besides the variables based on road entities, macro-level variables such as population density (Park et al. 2015), commuting characteristics (Lee, Abdel-Aty, and Cai 2017), and network density (Wang and Huang 2016) have been adopted in crash analysis for road entities.
There exist significant differences between macro- and micro-level crash analyses. First,
the crash analyses at the two levels have different implementations. At the macro-level,
crashes at segments and intersections are aggregated and analyzed to quantify the impacts
of various area-wide factors such as socioeconomic, transportation demand, and road
network attributes to provide countermeasures from a long-term planning perspective.
On the other hand, the micro-level crashes collected based on segments or intersec-
tion are analyzed to identify the influence of geometric design, lighting, and traffic flow
characteristics with the objective of offering engineering solutions. Second, the macro-
level analysis relates crash counts to factors at the zonal level, which does not account
for the heterogeneity of road entities from the micro-level. Meanwhile, the micro-level
analysis investigates crash counts for each road entity without considering the homogeneity
of road entities in the same zone at the macro-level. Third, crash counts have
different scales since they are aggregated at different levels, and more zero counts
are expected at the micro-level. Thus, the magnitudes of the effects of exogenous
variables should differ between the two levels. However, safety analyses at the two levels also have
common aspects. First, the crash counts are obtained by aggregating the same crashes
that happen in the study area. Hence, the total crash counts in all road entities in a
zone of interest are supposed to be equal to the crash frequency in this zone. Besides,
the traffic- and road-related variables are correlated at two levels since these variables
at the macro-level are aggregated from the micro-level. For example, the variable total
road length at the macro-level is collected by summing up the length of all roads at the
micro-level. Hence, it is not surprising that similar contributing factors and patterns can
be observed in the macro- and micro-level analyses. For instance, previous studies have
found that traffic exposure (e.g. VMT at the macro-level, AADT at the micro-level) has a positive
association with crash counts (Abdel-Aty et al. 2013; Wang, Abdel-Aty, and Lee 2017).
Because the two levels share the same study target, the macro- and micro-level crash
frequency analyses might support each other, and integrating them within one model structure
is expected to help better understand crash occurrence. Besides, an integrated model
structure could analyze crashes while considering both the heterogeneity and the
homogeneity of road entities. Abdel-Aty et al. (2016) and Lee,
Abdel-Aty, and Jiang (2014) suggested a methodology to integrate macro- and micro-level
data to provide a comprehensive perspective for crash occurrence. Still, their methodol-
ogy is based on the macro-level crash models. No earlier study in the literature, to our
knowledge, has attempted to conduct crash analyses at macro- and micro-levels with
an integrated modeling approach. Hence, this study intends to propose an integrated
model, which can simultaneously examine the transportation safety problem at the two
levels.
The rest of the paper is organized as follows: Section 3 proposes an integrated model after introducing non-integrated models for the macro- and micro-levels. Section 4 describes the data collected for the empirical analysis of the proposed model. Section 5 compares model performance and discusses the modeling results. Based on the modeling results, a novel integrated hotspot screening process is suggested and screening results are presented in Section 6. Finally, the last section concludes the paper.
3. Methodology
3.1. Bayesian non-integrated spatial model
3.1.1. Bayesian non-integrated spatial model at the macro-level
Traditional Poisson and negative binomial models have been widely used in the previous
macro-level traffic safety literature. Nevertheless, the models do not consider a possible
spatial correlation of traffic crash counts between adjacent zones, which may yield biased
modeling results (Hadayeghi, Shalaby, and Persaud 2010; Quddus 2008). By incorporating
an error term for possible spatial autocorrelation, the Bayesian spatial Poisson lognormal
model with a Conditional Autoregressive (CAR) prior can provide more appropriate analysis
results and has been widely adopted in macro-level crash analysis (Huang, Abdel-Aty,
and Darwiche 2010; Lee, Abdel-Aty, Choi et al. 2015; Miaou, Song, and Mallick 2003; Cai,
Abdel-Aty, and Lee 2017; Quddus 2008; Siddiqui and Abdel-Aty 2012).
The spatial model for the macro-level can be expressed as
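$$y_i^{\mathrm{zone}} \sim \mathrm{Poisson}\left(\lambda_i^{\mathrm{zone}}\right), \tag{1}$$
$$\log\left(\lambda_i^{\mathrm{zone}}\right) = \beta^{\mathrm{zone}} x_i^{\mathrm{zone}} + \theta_i^{\mathrm{zone}} + \varphi_i^{\mathrm{zone}}, \tag{2}$$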
where $y_i^{\mathrm{zone}}$ is the number of total crashes in zone $i$ and $\lambda_i^{\mathrm{zone}}$ is the expected value of $y_i^{\mathrm{zone}}$. $x_i^{\mathrm{zone}}$ is a set of explanatory variables, while $\beta^{\mathrm{zone}}$ is the set of corresponding parameters. $\theta_i^{\mathrm{zone}}$ is a random effect accounting for the unstructured over-dispersion, which follows a normal distribution:
$$\theta_i^{\mathrm{zone}} \sim N\left(0, \frac{1}{\tau_h}\right), \tag{3}$$
where $\tau_h$ is the precision parameter (the inverse of the variance), which follows a gamma (0.001, 0.001) prior.
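$\varphi_i^{\mathrm{zone}}$ is a random effect term used to deal with the spatial autocorrelation among zones. $\varphi_i^{\mathrm{zone}}$ follows a normal distribution with the CAR prior suggested by Besag, York, and Mollié (1991):
$$\varphi_i^{\mathrm{zone}} \sim N\left(\frac{\sum_{j \neq i} w_{ij}^{\mathrm{zone}} \varphi_j^{\mathrm{zone}}}{\sum_{j \neq i} w_{ij}^{\mathrm{zone}}}, \frac{1}{\tau_c \sum_{j \neq i} w_{ij}^{\mathrm{zone}}}\right), \tag{4}$$
in which $w_{ij}^{\mathrm{zone}}$ is the binary entry of the proximity matrix, with a value of 1 if zones $i$ and $j$ share a border and 0 otherwise. $\tau_c$ is the precision parameter, which also follows a gamma (0.001, 0.001) prior.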
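The proportion of variability in the random effects due to spatial autocorrelation can be calculated as
$$\alpha^{\mathrm{zone}} = \frac{sd\left(\varphi_i^{\mathrm{zone}}\right)}{sd\left(\theta_i^{\mathrm{zone}}\right) + sd\left(\varphi_i^{\mathrm{zone}}\right)}, \tag{5}$$
where sd(·) represents the empirical marginal standard deviation function.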
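As a rough illustration of Equations (4) and (5), the sketch below computes the CAR conditional mean and variance for one zone from a binary proximity matrix, and the spatial proportion from two sets of random-effect values. The four-zone layout, the numeric values, and the function names are assumptions made for illustration only; they are not taken from the study, and the use of the population standard deviation for sd(·) is likewise an assumption.

```python
import statistics

def car_conditional(i, w, phi, tau_c):
    """Conditional mean and variance of phi[i] given the other zones (Equation (4))."""
    neighbours = [j for j in range(len(phi)) if j != i and w[i][j] == 1]
    n_i = len(neighbours)  # equals the sum of w_ij over j != i for binary weights
    if n_i == 0:
        raise ValueError("zone has no neighbours; the CAR conditional is undefined")
    mean = sum(phi[j] for j in neighbours) / n_i
    variance = 1.0 / (tau_c * n_i)
    return mean, variance

def spatial_proportion(theta, phi):
    """Equation (5): sd(phi) / (sd(theta) + sd(phi)), using population sd (assumption)."""
    sd_theta = statistics.pstdev(theta)
    sd_phi = statistics.pstdev(phi)
    return sd_phi / (sd_theta + sd_phi)

# Hypothetical layout: four zones in a row, adjacent zones share a border.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
phi = [0.2, -0.1, 0.4, 0.0]        # assumed spatial random effects
theta = [0.05, -0.05, 0.1, -0.1]   # assumed unstructured random effects

mean_1, var_1 = car_conditional(1, w, phi, tau_c=2.0)
alpha = spatial_proportion(theta, phi)
```

In a full Bayesian fit these quantities would be evaluated on posterior samples rather than on fixed values; the sketch only shows the arithmetic.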
3.1.2. Bayesian non-integrated spatial model at the micro-level
At the micro-level, road entities located in close proximity may also share similar factors, resulting in spatial autocorrelation of traffic crashes among road entities. Compared with the spatial autocorrelation solely between segments or solely between intersections, the spatial correlation effects between adjacent segments and intersections may be more significant if they are directly connected with each other. To this end, Zeng and Huang (2014) proposed a Bayesian spatial joint model that simultaneously analyzes the crash frequency of segments and intersections. The model introduces an indicator $\gamma_m$ to distinguish whether a road entity is a segment or an intersection, since segments and intersections should have different exogenous factors affecting traffic safety. Specifically, $\gamma_m$ is 1 if road entity $m$ is a segment and 0 if it is an intersection. Then, the model at the micro-level is as follows:
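$$y_m^{\mathrm{entity}} \sim \mathrm{Poisson}\left(\lambda_m^{\mathrm{entity}}\right), \tag{6}$$
$$\log\left(\lambda_m^{\mathrm{entity}}\right) = \gamma_m \times \left(\beta^{\mathrm{seg}} x_m^{\mathrm{seg}} + \log\left(\mathrm{length}_m^{\mathrm{seg}}\right)\right) + (1 - \gamma_m) \times \left(\beta^{\mathrm{inter}} x_m^{\mathrm{inter}}\right) + \theta_m^{\mathrm{entity}} + \varphi_m^{\mathrm{entity}}, \tag{7}$$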
where $y_m^{\mathrm{entity}}$ is the number of crashes at segment or intersection $m$. $x_m^{\mathrm{seg}}$ and $x_m^{\mathrm{inter}}$ denote the sets of explanatory variables of segments and intersections, while $\beta^{\mathrm{seg}}$ and $\beta^{\mathrm{inter}}$ are the corresponding parameters. $\log(\mathrm{length}_m^{\mathrm{seg}})$ is the logarithm of the length of road entity $m$ if it is a segment; otherwise it is 0. Similar to the spatial model at the macro-level, $\theta_m^{\mathrm{entity}}$ and $\varphi_m^{\mathrm{entity}}$ represent the two random effects that account for the unstructured over-dispersion effect and the spatial correlation effect, respectively. The spatial random effect $\varphi_m^{\mathrm{entity}}$ is also assumed to have a CAR prior. If two road entities $m$ and $n$ directly connect with each other, the weight $w_{mn}^{\mathrm{entity}}$ in the spatial proximity matrix is set to 1; otherwise, the weight is 0. In this way, the joint model can capture the spatial correlation not only between road entities of the same type but also between the two different types of road entities, i.e. segments and intersections.
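To make the role of the indicator in Equation (7) concrete, the following minimal sketch evaluates the log-linear predictor for one segment and one intersection. All coefficient values, covariate values, and function names are hypothetical; passing a length of 1.0 for an intersection is an implementation convenience assumed here so that the offset term is 0, as the text specifies.

```python
import math

def dot(coefs, values):
    """Inner product of a coefficient vector and a covariate vector."""
    return sum(c * v for c, v in zip(coefs, values))

def log_lambda_entity(gamma, x_seg, x_inter, length_seg,
                      beta_seg, beta_inter, theta, phi):
    """Right-hand side of Equation (7) for a single road entity."""
    segment_part = dot(beta_seg, x_seg) + math.log(length_seg)
    intersection_part = dot(beta_inter, x_inter)
    return gamma * segment_part + (1 - gamma) * intersection_part + theta + phi

# Hypothetical coefficients (not estimates from the paper).
beta_seg = [0.5]
beta_inter = [0.1, 0.2]

# A segment (gamma = 1) of length 2.0 with one covariate.
log_lam_segment = log_lambda_entity(1, [2.0], [0.0, 0.0], 2.0,
                                    beta_seg, beta_inter, 0.0, 0.0)

# An intersection (gamma = 0); length 1.0 makes the offset log(1) = 0.
log_lam_intersection = log_lambda_entity(0, [0.0], [1.0, 3.0], 1.0,
                                         beta_seg, beta_inter, 0.1, -0.1)
```

Because the segment length enters as an offset, doubling the length doubles the expected count for a segment while leaving intersections unaffected.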
3.2. Bayesian integrated spatial model at the two levels
Figure 1 presents three GIS layers illustrating the spatial relations between crashes, road entities (micro-level), and zones (macro-level). As shown in Figure 1, the same crashes in the study area are aggregated at the macro- and micro-levels for the crash analyses. Hence, the crash count of a zone is supposed to be the same as the total crashes of all road entities in that zone. Let a matrix $W$ denote the spatial interaction between zones and road entities. The entry $w_{mi}$ of the spatial interaction matrix is assigned a value of 1 if road entity $m$ is located in zone $i$ and 0 otherwise. If $\hat{i}$ zones and $\hat{m}$ road entities are included in the study, an $\hat{m} \times \hat{i}$ spatial dependence matrix can be generated. Then, the relation between observed crashes at the macro- and micro-levels can be expressed as follows:
$$y_i^{\mathrm{zone}} = \sum_{m=1}^{k} y_m^{\mathrm{entity}} w_{mi}. \tag{8}$$
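The aggregation in Equation (8) can be sketched in a few lines. The example below builds a small interaction matrix from a list that records the zone of each road entity and then sums entity counts by zone; the counts, the zone assignments, and the function names are invented for illustration.

```python
def build_interaction_matrix(zone_of_entity, n_zones):
    """Return W with W[m][i] = 1 if road entity m lies in zone i, else 0."""
    return [[1 if zone == i else 0 for i in range(n_zones)]
            for zone in zone_of_entity]

def aggregate_to_zones(entity_counts, w):
    """Equation (8): zone count = sum over entities of y_entity[m] * w[m][i]."""
    n_zones = len(w[0])
    return [sum(y * row[i] for y, row in zip(entity_counts, w))
            for i in range(n_zones)]

# Hypothetical example: five road entities in two zones.
zone_of_entity = [0, 0, 1, 1, 1]
entity_counts = [3, 0, 5, 2, 1]

W = build_interaction_matrix(zone_of_entity, n_zones=2)
zone_counts = aggregate_to_zones(entity_counts, W)
```

Each row of W contains exactly one 1, so every crash is counted in exactly one zone and the zone totals sum to the entity total.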
Figure 1. Illustration of spatial relation among crashes, road entities, and zones.
Based on the equivalence relation presented in Equation (8), the non-integrated models for the macro- and micro-levels can be linked. However, the expected crash counts at the macro-level might not be the same as the total expected number of crashes at the micro-level, since they are estimated at different levels with different explanatory variables. Therefore, an adjustment factor is introduced to relax the equivalence constraint. The link function between the macro- and micro-levels can be specified as
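$$u_i^{\mathrm{zone}} = \sum_{m=1}^{k} \lambda_m^{\mathrm{entity}} w_{mi}, \tag{9}$$
$$\lambda_i^{\mathrm{zone}} = u_i^{\mathrm{zone}} \times ADJ_i, \tag{10}$$
$$ADJ_i = \exp\left(\beta^{\mathrm{zone}} x_i^{\mathrm{zone}} + \theta_i^{\mathrm{zone}} + \varphi_i^{\mathrm{zone}}\right), \tag{11}$$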
where $u_i^{\mathrm{zone}}$ is the total expected crashes ($\lambda_m^{\mathrm{entity}}$) of all road entities in zone $i$, and $\lambda_m^{\mathrm{entity}}$ can be estimated based on the non-integrated spatial model at the micro-level (Equation (7)). $ADJ_i$ is the adjustment factor of $u_i^{\mathrm{zone}}$ and $\lambda_i^{\mathrm{zone}}$ is the expected number of crashes in zone $i$, as in the non-integrated spatial model at the macro-level (Equation (2)). The adjustment factor represents how the number of crashes in a zone would differ given the same road network but different socio-demographic characteristics. Hence, only macro-level socioeconomic variables are adopted for the estimation of the adjustment factor $ADJ_i$. Also, $\theta_i^{\mathrm{zone}}$ and $\varphi_i^{\mathrm{zone}}$ are two random terms that capture the unobserved and spatial autocorrelation effects at the macro-level. In the integrated approach, the expected crash counts of road entities ($\lambda_m^{\mathrm{entity}}$) are estimated by Equation (7), subject to the relation with the crash counts of zones shown in Equations (9) and (10). Meanwhile, the expected crash frequencies of zones are the product of the total expected crash counts of all road entities and the adjustment factors (see Equations (10) and (11)). Hence, based on the integrated model structure with Equations (1) and (6)–(11), the crashes at the macro- and micro-levels can be investigated simultaneously.
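The link between the two levels in Equations (9)–(11) can be illustrated with a small numerical sketch. It takes expected entity-level counts as given, sums them by zone, and scales each sum by the adjustment factor. The values and function names below are assumptions for illustration; in the actual model all quantities are estimated jointly within the Bayesian framework rather than computed in sequence.

```python
import math

def dot(coefs, values):
    """Inner product of a coefficient vector and a covariate vector."""
    return sum(c * v for c, v in zip(coefs, values))

def integrated_zone_means(lambda_entity, w, beta_zone, x_zone, theta_zone, phi_zone):
    """Expected zone-level crash counts under Equations (9)-(11)."""
    zone_means = []
    for i in range(len(x_zone)):
        # Equation (9): total expected crashes of the road entities in zone i.
        u_i = sum(lam * row[i] for lam, row in zip(lambda_entity, w))
        # Equation (11): adjustment factor from zone-level covariates and random effects.
        adj_i = math.exp(dot(beta_zone, x_zone[i]) + theta_zone[i] + phi_zone[i])
        # Equation (10): adjusted expected count for zone i.
        zone_means.append(u_i * adj_i)
    return zone_means

# Hypothetical inputs: three road entities, two zones, one zone-level covariate.
lambda_entity = [2.0, 3.0, 5.0]
W = [[1, 0], [1, 0], [0, 1]]
x_zone = [[1.0], [0.0]]
beta_zone = [math.log(2.0)]

zone_means = integrated_zone_means(lambda_entity, W, beta_zone, x_zone,
                                   theta_zone=[0.0, 0.0], phi_zone=[0.0, 0.0])
```

With these inputs both zones have the same entity-level total, but the first zone's covariate doubles its expected count, which is the kind of zone-specific departure the adjustment factor is meant to absorb.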
All the models were coded and estimated using WinBUGS, a popular programming platform for Bayesian inference. The significant explanatory variables were determined based on 95% Bayesian credible intervals (BCIs). The deviance information criterion (DIC) was used to measure model performance and to determine the best set of parameters for each model. DIC is a common measure for Bayesian model comparison, and a lower DIC value is preferred. As a rough guide, a difference of more than 10 might indicate that the model with the lower DIC performs better (El-Basyouny and Sayed 2009).
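For readers unfamiliar with DIC, the sketch below shows one common way to compute it from posterior draws: the posterior mean deviance plus the effective number of parameters, where the latter is the mean deviance minus the deviance at the posterior mean. The Poisson likelihood matches the models above, but the draws are made up, the function names are hypothetical, and plugging in the posterior mean of the Poisson means (rather than of other parameters) is an assumption that may differ from what WinBUGS does internally.

```python
import math

def poisson_deviance(y, lam):
    """Deviance D = -2 * log-likelihood for independent Poisson counts."""
    loglik = sum(yi * math.log(li) - li - math.lgamma(yi + 1)
                 for yi, li in zip(y, lam))
    return -2.0 * loglik

def dic(y, lambda_draws):
    """Return (DIC, pD) from posterior draws of the Poisson means."""
    deviances = [poisson_deviance(y, draw) for draw in lambda_draws]
    d_bar = sum(deviances) / len(deviances)           # posterior mean deviance
    lam_mean = [sum(draw[k] for draw in lambda_draws) / len(lambda_draws)
                for k in range(len(y))]
    d_at_mean = poisson_deviance(y, lam_mean)         # deviance at posterior mean
    p_d = d_bar - d_at_mean                           # effective number of parameters
    return d_bar + p_d, p_d

# Hypothetical observed counts and two posterior draws of the means.
y = [2, 0, 3]
lambda_draws = [[1.0, 0.5, 2.0], [3.0, 0.5, 4.0]]

dic_value, p_d = dic(y, lambda_draws)
```

Two candidate models would then be compared by their DIC values, with the lower value preferred as described above.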
3.3. Measurement for model comparison
Besides the DIC mentioned above, two additional measures were employed to compare
the model performance at both the macro- and micro-levels. Mean Absolute Error (MAE)
computes the mean of absolute errors with the following equation:
$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - y_i' \right|, \tag{12}$$
where $N$ is the number of observations, and $y_i$ and $y_i'$ are the observed and predicted number of crashes of site $i$ at the macro- and micro-levels.
Root Mean Squared Error (RMSE) calculates the square root of the sum of the squared errors divided by the number of observations as follows:
$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - y_i' \right)^2}. \tag{13}$$
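Equations (12) and (13) translate directly into code. The short sketch below uses invented observed and predicted counts and hypothetical function names.

```python
import math

def mean_absolute_error(observed, predicted):
    """Equation (12): mean of the absolute prediction errors."""
    n = len(observed)
    return sum(abs(y - p) for y, p in zip(observed, predicted)) / n

def root_mean_squared_error(observed, predicted):
    """Equation (13): square root of the mean squared prediction error."""
    n = len(observed)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)

# Hypothetical observed and predicted crash counts for three sites.
observed = [1, 2, 3]
predicted = [1.0, 3.0, 5.0]

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture of prediction accuracy.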
4. Empirical data
A data set was carefully collected for 78 traffic analysis districts (TADs) in Orlando, Florida to demonstrate the empirical application of the proposed model. TADs are newly developed transportation-related geographic units created by combining existing TAZs (FHWA 2011). A TAD is considerably larger than a TAZ (Cai et al. 2017). Usually, TAZs nest within counties while TADs may cross county boundaries, but they must nest within metropolitan planning organizations (FHWA 2011). In previous safety studies, TAZs have been widely adopted since they make it easier to integrate traffic safety with the transportation planning process (Abdel-Aty et al. 2013; Yang et al. 2018). However, TAZs are often delineated by arterial roads and thus many crashes occur on zone boundaries. The existence of boundary crashes may invalidate the assumption of correlating crashes with only the characteristics of the zone where the crash is spatially located (Lee, Abdel-Aty, and Jiang 2015; Siddiqui and Abdel-Aty 2012). Also, a TAZ might be very small, especially in urban areas, so it is more likely that a driver who causes a crash in a TAZ comes from another TAZ. If this is the case, the characteristics of the driver’s residence cannot be considered in TAZ-based models. Cai et al. (2017) compared the performance of TAZ- and TAD-based crash models, and concluded that TAD-based models are superior for macro-level crash analysis and transportation safety planning. Hence, in our study, TADs were adopted for analysis at the macro-level. In the same study area, a total of 3316 road entities, including 2434 segments and 882 intersections on major roads, were identified for the analysis (Figure 2). It is noteworthy that the study area contains more segments and intersections than these. However, traffic data were not available for all roads, especially local roads. Thus, only segments and intersections with available traffic data were selected, and their traffic and crashes were aggregated at the macro-level. Most segments and intersections on freeways/expressways, arterials, and collectors were included in this study. In order to have consistent crash data at the two levels, the crashes that occurred at segments and intersections without traffic data were excluded, and other roadway information on these road entities was not included either. However, the proposed model can be easily extended to include all the crashes once all road entities have traffic data.
Figure 2. Selected TADs and road network in Orlando, Florida: overall study area (left); TADs (upper right) and road network (bottom right) in Downtown Orlando.
The spatial interaction between TADs and road entities was processed using ArcGIS 10.2 (ESRI) based on the digital maps provided by the U.S. Census Bureau (USCB) and the Florida Department of Transportation (FDOT). As noted above, many segments and intersections are located on the boundaries of TAZs, since one of the zoning criteria of TAZs is to recognize physical boundaries such as arterials (Cai et al. 2017; Lee, Abdel-Aty, and Jiang 2014), and the size of a TAZ is quite small (on average 5.50 mi² in Orlando). However, the TADs were developed by combining existing TAZs and a TAD is considerably larger (on average 36.59 mi²). Hence, most road entities could be located inside TADs. If a road entity is located on the boundary of two or more TADs, a geospatial method was applied to assign it to a TAD. Specifically, each intersection was assigned to the TAD whose digital boundary contains it, while each segment was allocated to the TAD containing the largest proportion of the segment. Hence, a unique spatial interaction between each road entity (micro-level) and one TAD (macro-level) can be obtained. A 3316 × 78 spatial dependence matrix can be generated corresponding to the 3316 road entities and 78 TADs. Also, the spatial autocorrelation matrices for TADs and for road entities can be obtained by applying the spatial join feature in ArcGIS. The descriptive statistics for the spatial relations are presented in Table 1. Notably, all TADs have adjacent TADs and each TAD has at least five road entities. Besides, the maximum number of neighbors among road entities is 21, which might be because some long segments connect many intersections and other segments.

Table 1. Descriptive statistics for spatial relations.
| Variables | Definition | Mean | S.D. | Min. | Max. |
|---|---|---|---|---|---|
| Spatial autocorrelation between TADs | | | | | |
| N_TAD_NEI | Number of neighbors among TADs | 5.80 | 1.55 | 2 | 10 |
| Spatial autocorrelation between road entities | | | | | |
| N_ENTITY_NEI | Number of neighbors among road entities | 3.03 | 2.09 | 0 | 21 |
| Spatial dependence between TADs and road entities | | | | | |
| N_TAD_ENTITY | Number of road entities in each TAD | 42.51 | 29.13 | 5 | 189 |
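The assignment rule for boundary segments described in this section can be sketched as follows, assuming the length of each segment falling inside each TAD has already been measured in a GIS. The segment identifiers, TAD identifiers, lengths, and function name are hypothetical; the study itself performed this step in ArcGIS, and ties are broken here by dictionary order, which is an arbitrary choice.

```python
from collections import Counter

def assign_segment_to_tad(length_in_tad):
    """Return the TAD that contains the largest share of the segment's length.

    length_in_tad maps a TAD identifier to the segment length inside that TAD.
    """
    return max(length_in_tad, key=length_in_tad.get)

# Hypothetical segments with the length (in miles) lying in each TAD.
segments = {
    "seg_1": {"TAD_A": 0.30, "TAD_B": 0.70},
    "seg_2": {"TAD_A": 1.20},
    "seg_3": {"TAD_B": 0.45, "TAD_C": 0.40},
}

segment_tad = {seg: assign_segment_to_tad(parts) for seg, parts in segments.items()}

# Number of road entities per TAD, analogous to N_TAD_ENTITY in Table 1.
entities_per_tad = Counter(segment_tad.values())
```

Because every segment receives exactly one TAD, the resulting mapping yields one nonzero entry per row of the spatial dependence matrix.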
The crashes that occurred in Orlando during 2010–2012 were collected from the FDOT’s Crash Analysis Reporting System (CARS) and the Signal Four Analytics (S4A) database. In the database, crashes occurring within 50 feet and within 250 feet of an intersection are defined as ‘crashes at intersection’ and ‘crashes influenced by intersection’, respectively. According to this principle, a 250-foot buffer was created around each intersection; crashes in the buffers were collected and classified as intersection-related crashes, while other crashes were categorized as segment-related crashes. Then, the crashes in each TAD can be obtained by summing up the crash counts of all road entities in the corresponding TAD according to the spatial interaction.
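The buffer-based classification can be illustrated with a simple distance check. The sketch assumes crash and intersection locations are given as planar coordinates in feet, and it treats a crash exactly 250 feet away as intersection-related; both are assumptions, as are the coordinates and the function name.

```python
import math

BUFFER_FT = 250.0  # buffer radius around each intersection, as stated in the text

def classify_crash(crash_xy, intersections_xy, buffer_ft=BUFFER_FT):
    """Label a crash by its distance to the nearest intersection."""
    nearest = min(math.dist(crash_xy, point) for point in intersections_xy)
    return "intersection" if nearest <= buffer_ft else "segment"

# Hypothetical planar coordinates in feet.
intersections = [(0.0, 0.0), (1000.0, 0.0)]

label_near = classify_crash((100.0, 0.0), intersections)
label_far = classify_crash((500.0, 0.0), intersections)
```

For a real network, a spatial index or a GIS buffer operation would replace the brute-force search over all intersections.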
A host of explanatory variables were considered for the analysis, including traffic, roadway, demographic, socioeconomic, and land-use factors. The traffic and road data for the road entities were first collected from FDOT and then spatially attached to the corresponding TADs in a similar way as crashes. The socio-demographic data were obtained from the USCB. These census tract-based data could be aggregated to TADs since a TAD is a combination of multiple census tracts (Cai et al. 2017). The land-use data were provided by the Florida Department of Revenue based on block groups. As block groups are even smaller than census tracts (Abdel-Aty et al. 2013), the land-use data could easily be aggregated to TADs. The descriptive statistics of the collected data based on TADs and road entities are summarized in Tables 2 and 3, respectively.
Table 2. Descriptive statistics of collected data for TADs (macro-level).
| Variables | Definition | Mean | S.D. | Min. | Max. |
|---|---|---|---|---|---|
| CRASH | Average annual crash count for each TAD | 257.03 | 213.17 | 18 | 1038 |
| DVMT | Daily VMT (in thousand) | 494.53 | 440.19 | 23.30 | 2210.21 |
| Segment-related variables | | | | | |
| ROAD_LENGTH | Total road length in each TAD (mi) | 23.60 | 29.72 | 1.53 | 248.65 |
| P_FREEWAY | Proportion of segment length of freeway | 0.14 | 0.17 | 0 | 0.71 |
| P_ARTERIAL | Proportion of segment length of arterial | 0.40 | 0.21 | 0 | 0.74 |
| P_COLLECTOR | Proportion of segment length of collector | 0.46 | 0.22 | 0 | 1 |
| P_LOCALROAD | Proportion of segment length of local road | 0.01 | 0.03 | 0 | 0.23 |
| P_LANE1_2 | Proportion of segment length with 1 or 2 lanes | 0 | 0.00 | 0 | 0.03 |
| P_LANE3_4 | Proportion of segment length with 3 or 4 lanes | 0.39 | 0.22 | 0 | 0.87 |
| P_LANE5MORE | Proportion of segment length with 5 lanes or over | 0.16 | 0.17 | 0 | 0.74 |
| P_MEDIANROAD | Proportion of segment length having median | 0.68 | 0.22 | 0.10 | 1 |
| Intersection-related variables | | | | | |
| INTER_DENS | Number of intersections per mile (/mile) | 1.70 | 0.57 | 1 | 4.33 |
| P_SINGAL | Proportion of signalized intersections | 0.78 | 0.24 | 0 | 1 |
| P_LEG3 | Proportion of intersections with 3 legs | 0.32 | 0.17 | 0 | 0.73 |
| P_LEG4 | Proportion of intersections with 4 legs | 0.67 | 0.18 | 0 | 1 |
| Socio-demographic variables | | | | | |
| POP_DENS | Population density (in thousand) | 2.38 | 1.49 | 0.02 | 6.56 |
| P_AGE1524 | Proportion of population aged 15–24 | 0.16 | 0.05 | 0.09 | 0.38 |
| P_AGE65MORE | Proportion of population aged 65 or over | 0.10 | 0.03 | 0.04 | 0.18 |
| COMMUTERS_DENS | Commuters density (/mi²) | 1163.12 | 728.39 | 9.32 | 3103.77 |
| MEDIAN_INC | Median household income (in thousand) | 63.40 | 19.47 | 33.99 | 122.77 |
| DIS_URBAN | Distance to the nearest urban area (mile) | 1.40 | 1.71 | 1.00 | 14.12 |
| Land-use variables | | | | | |
| P_BAR | Proportion of nightclubs, cocktail lounges, and bar area | 0.0002 | 0.0004 | 0 | 0.001 |
| P_TOU | Proportion of tourist attraction, hotels, and motels area | 0.002 | 0.01 | 0 | 0.05 |
| P_SU | Proportion of schools and colleges area | 0.02 | 0.04 | 0 | 0.33 |
| P_RES | Proportion of residential area | 0.43 | 0.21 | 0.01 | 0.94 |
| P_IND | Proportion of industrial area | 0.03 | 0.04 | 0 | 0.28 |
| P_AGR | Proportion of agriculture area | 0.10 | 0.15 | 0 | 0.73 |

5. Model estimation
5.1. Model comparison
As discussed above, three models were estimated in this study, i.e. (1) a non-integrated model for the macro-level, (2) a non-integrated model for the micro-level, and (3) an integrated model for both levels. Prior to discussing the model results, we present the performance of the estimated models in Table 4. The table presents the DIC, MAE, and RMSE for the two levels based on the results of the non-integrated and integrated models. Several observations can be made from Table 4. At the macro-level, the integrated model provides significantly smaller values of the three measures than the non-integrated model. Specifically, the DIC difference at the macro-level is 44.99, which indicates a significant difference between the two models (El-Basyouny and Sayed 2009). Likewise, the prediction accuracy of crash frequency at the macro-level in the integrated model is improved by 27.99% and 18.57% based on the MAE and RMSE, respectively. At the micro-level, the integrated model also provides a significantly smaller DIC than the non-integrated model. Besides, the goodness-of-fit at the micro-level is improved by 21.16% and 23.33% according to the MAE and RMSE, respectively. Hence, we can generally conclude that the proposed integrated model is preferable for crash frequency analysis at both the macro- and micro-levels, with better overall statistical fit.
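The percentage improvements quoted in the model comparison are presumably relative reductions in the error measure of the integrated model with respect to the non-integrated one; the text does not state the formula, so this reading is an assumption. Under that assumption the calculation is as below, with made-up error values that are not the Table 4 results.

```python
def relative_improvement(baseline_error, new_error):
    """Percentage reduction of an error measure relative to the baseline model."""
    return (baseline_error - new_error) / baseline_error * 100.0

# Hypothetical MAE values for a non-integrated and an integrated model.
mae_non_integrated = 10.0
mae_integrated = 8.0

improvement_pct = relative_improvement(mae_non_integrated, mae_integrated)
```

The same function applies to RMSE; a positive value means the integrated model has the smaller error.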
The model comparison results discussed above indicate that the proposed integrated
model can improve the crash frequency prediction and analysis at the macro- and micro-
levels. These findings are not surprising. At the macro-level, a possible explanation
is that the integrated model estimates zonal crashes from less aggregated traffic and road
variables at the micro-level, and the explanatory factors associated with crash risk at
the micro-level may be more direct and specific to crash circumstances (Huang et al., ‘Macro
and Micro Models,’ 2016). In comparison, the non-integrated model for the macro-level
crash frequency analysis adopts a list of aggregated traffic and roadway variables from the
micro-level together with socio-demographic variables based on the macro-level. Hence,
the non-integrated model for macro-level cannot consider the heterogeneity of different
road entities since the potential variation is neutralized by the aggregation of data. At the
micro-level, the crash counts of segments and intersections were estimated with a set of
micro-level factors by using Equation (7). However, based on Equations (9)–(11), the inte-
grated model should have an additional constraint for road entities, i.e. the total crashes
Table 3. Descriptive statistics of collected data for road entities (micro-level).
Variables Definition Mean S.D. Min. Max.
Segment variables
CRASH Average annual crash count for each segment 6.20 12.59 0 132
LENGTH Segment length (mile) 0.75 1.35 0.10 30.91
AADT Average annual daily traffic (in thousand) 20.19 25.51 0.20 195.77
FREEWAY Freeway indicator: 1 if freeway, 0 otherwise 0.11 0.31 0 1
ARTERIAL Arterial indicator: 1 if arterial, 0 otherwise 0.39 0.49 0 1
COLLECTOR Collector indicator: 1 if collector, 0 otherwise 0.49 0.50 0 1
LOCALROAD Local road indicator: 1 if local road, 0 otherwise 0.01 0.11 0 1
MEDIAN Median barrier indicator: 1 if present, 0 otherwise 0.63 0.48 0 1
LANE1_2 1 or 2 lanes indicator: 1 if yes, 0 otherwise 0.56 0.50 0 1
LANE3_4 3 or 4 lanes indicator: 1 if yes, 0 otherwise 0.30 0.46 0 1
LANE5MORE 5 or more lanes indicator: 1 if yes, 0 otherwise 0.15 0.36 0 1
URBAN Urban indicator: 1 if in urban area; 0 otherwise 0.93 0.26 0 1
Intersection variables
CRASH Average annual crash count for each intersection 16.86 20.34 0 135
MAJ_AADT AADT on major approach (in thousand) 23.72 15.76 0.60 81.50
MIN_AADT AADT on minor approach (in thousand) 8.22 7.64 0.20 52.50
TRAFFIC_SIGNAL Traffic signal indicator: 1 if present, 0 otherwise 0.76 0.43 0 1
LEG3 3-Leg intersection indicator: 1 if yes, 0 otherwise 0.31 0.46 0 1
LEG4 4-Leg intersection indicator: 1 if yes, 0 otherwise 0.69 0.46 0 1
URBAN Urban indicator: 1 if in urban area; 0 otherwise 0.99 0.10 0 1
Table 4. Comparison results of model performance.
Non-integrated model Integrated model Difference between models
Measure Macro-level Micro-level Macro-level Micro-level Macro-level Micro-level
DIC 798.83 17524.30 753.84 17506.60 44.99 17.70
MAE 161.41 10.16 116.23 8.01 45.18 2.15
RMSE 242.28 24.43 197.30 18.73 44.98 5.70
for all road entities in the same TAD should be equal to the macro-level total crashes in
the specific TAD adjusted by socio-demographic factors. The segments and intersections
in the same TAD have the same constraint, which might affect the parameter estimation
of micro-level variables. In conclusion, the macro- and micro-level crash frequency mod-
els indeed support each other and the integrated model can consequently improve model
performance for crash prediction and analyses at the two levels.
5.2. Model results
The results of three models (i.e. two non-integrated models, one integrated model) for
crashes at both macro- and micro-levels are displayed in Tables 5–7. The results for two non-
integrated models only present the variables with significant effects on crash frequency at
either the macro-level or the micro-level. On the other hand, the integrated model results
consist of two components: (1) significant variables affecting the crash counts at the macro-
and micro-levels and (2) other socio-demographic variables at the macro-level adjusting
the relation of the expected crash counts between the two levels. All micro-level signif-
icant variables in the integrated model can also be found significant in the micro-level
non-integrated model. Meanwhile, the same significant socio-demographic variables can
be obtained from the integrated model and the non-integrated model for the macro-level.
All the significant variables are found to have consistent signs of parameter estimates in
Table 5. Non-integrated model result at macro-level.
BCI
Variable Definition Mean S.D. 2.50% 97.50%
Intercept -3.44 0.06 -3.56 -3.31
DVMT Daily VMT 0.88 0.00 0.88 0.89
Segment-related variables
P_ARTERIAL Proportion of segment length of arterial 0.66 0.07 0.52 0.80
Intersection-related variables
INTER_DENS Number of intersections per mile 0.79 0.08 0.63 0.94
P_SIGNAL Proportion of signalized intersections 0.48 0.05 0.37 0.58
Socio-demographic variables
P_AGE1524 Proportion of population aged 15–24 1.81 0.44 1.02 2.39
MEDIAN_INC Median household income -0.25 0.01 -0.27 -0.24
DIS_URBAN Distance to the nearest urban area -0.22 0.06 -0.35 -0.09
Land-use variables
P_TOU Proportion of tourist attraction, hotels, and motels area 8.25 3.17 2.62 13.58
Random effects
sd(θzone ) Standard deviation of θzone 0.30 0.03 0.21 0.35
sd(φzone ) Standard deviation of φzone 0.21 0.03 0.16 0.27
αzone Proportion of variability due to spatial correlation 0.59 0.05 0.46 0.68
Table 6. Non-integrated model result at micro-level.
BCI
Variable Definition Mean S.D. 2.50% 97.50%
Segment
Intercept -3.34 0.09 -3.47 -3.21
AADT Average annual daily traffic 0.55 0.01 0.54 0.56
ARTERIAL Arterial indicator: 1 if arterial, 0 otherwise 0.27 0.03 0.22 0.34
LANE1_2 1 or 2 lanes indicator: 1 if yes, 0 otherwise -0.41 0.03 -0.47 -0.34
MEDIAN Median barrier indicator: 1 if present, 0 otherwise 0.11 0.03 0.05 0.16
URBAN Urban indicator: 1 if in urban area; 0 otherwise 0.73 0.07 0.61 0.86
Intersection
Intercept -8.18 0.08 -8.35 -7.99
MAJ_AADT AADT on major approach 0.75 0.01 0.74 0.76
MIN_AADT AADT on minor approach 0.29 0.01 0.27 0.31
TRAFFIC_SIGNAL Traffic signal indicator: 1 if present, 0 otherwise 0.45 0.04 0.38 0.53
LEG3 3-Leg intersection indicator: 1 if yes, 0 otherwise -0.51 0.04 -0.59 -0.42
Random effects
sd(θentity ) Standard deviation of θentity 2.73 0.17 2.40 3.07
sd(φentity ) Standard deviation of φentity 3.90 0.41 3.22 4.83
αentity Proportion of variability due to spatial correlation 0.79 0.02 0.75 0.83
the integrated and non-integrated models. Since the significant variables are from two lev-
els, multi-collinearity problems might exist. To test the extent of multi-collinearity in the
estimated models, the variance inflation factors (VIF) were calculated for all significant vari-
ables. A common rule of thumb is that multi-collinearity becomes problematic if the VIF
is greater than 5 (Menard 1995). The highest VIF (VIF of AADT for segment) in all models
was 1.81, which is far below the range of concern. While the results are summarized in the
three tables, the discussion of the parameter estimates at the two levels focuses on
the integrated model, which has better fit and more significant variables.
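The VIF check described above can be sketched as follows. This is a minimal illustration on synthetic data, not the study's computation: it relies on the identity that the VIF of predictor j equals the j-th diagonal element of the inverse correlation matrix of the predictors, and all function and variable names are hypothetical.

```python
import random
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long numeric columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mu, sd = mean(col), pstdev(col)
        standardized.append([(v - mu) / sd for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in standardized]
            for ci in standardized]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF_j is the j-th diagonal element of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Synthetic predictors (not the study's data): x3 is nearly a copy of x1.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(500)]
x2 = [rng.gauss(0, 1) for _ in range(500)]
x3 = [a + 0.1 * rng.gauss(0, 1) for a in x1]

vifs = variance_inflation_factors([x1, x2, x3])
flagged = [v > 5 for v in vifs]  # rule of thumb quoted in the text
```

In this synthetic example the nearly duplicated predictors x1 and x3 exceed the threshold of 5, while the independent predictor x2 stays close to 1, which is the situation the reported maximum VIF of 1.81 rules out.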
As shown in Table 7, a total of nine micro-level variables are statistically significant for crash
frequency with 95% BCIs: five segment-related variables (i.e. AADT (average annual daily
traffic), functional class is arterial, number of lanes is 1 or 2, presence of median barrier,
Table 7. Integrated model result at the two levels.
BCI
Variable Definition Mean S.D. 2.50% 97.50%
Segment-related variables
Intercept -2.87 0.05 -2.95 -2.80
AADT Average annual daily traffic 0.48 0.01 0.47 0.49
ARTERIAL Arterial indicator: 1 if arterial, 0 otherwise 0.31 0.03 0.27 0.38
LANE1_2 1 or 2 lanes indicator: 1 if yes, 0 otherwise -0.43 0.03 -0.48 -0.36
MEDIAN Median barrier indicator: 1 if present, 0 otherwise 0.19 0.04 0.12 0.24
URBAN Urban indicator: 1 if in urban area; 0 otherwise 0.95 0.03 0.91 1.01
Intersection-related variables
Intercept -7.96 0.06 -8.06 -7.87
MAJ_AADT AADT on major approach 0.74 0.01 0.72 0.76
MIN_AADT AADT on minor approach 0.29 0.01 0.27 0.30
TRAFFIC_SIGNAL Traffic signal indicator: 1 if present, 0 otherwise 0.45 0.06 0.35 0.57
LEG3 3-Leg intersection indicator: 1 if yes, 0 otherwise -0.54 0.04 -0.62 -0.46
Socio-demographic variables for adjusted factor
Intercept 3.62 0.07 3.49 3.75
P_AGE1524 Proportion of population aged 15–24 0.92 0.32 0.32 1.41
MEDIAN_INC Median household income -0.34 0.01 -0.35 -0.33
DIS_URBAN Distance to the nearest urban area -0.11 0.02 -0.16 -0.06
Random effects
sd(θentity ) Standard deviation of θentity 0.60 0.02 0.56 0.64
sd(φentity ) Standard deviation of φentity 0.92 0.04 0.87 1.01
sd(θzone ) Standard deviation of θzone 0.07 0.02 0.03 0.12
sd(φzone ) Standard deviation of φzone 0.10 0.03 0.04 0.14
αEntity Proportion of variability due to spatial correlation at micro level 0.61 0.01 0.58 0.63
αzone Proportion of variability due to spatial correlation at macro-level 0.57 0.15 0.25 0.81
urban area indicator) and four intersection-related variables (i.e. AADT on major approach,
AADT on minor approach, presence of traffic signal, number of legs is 3). The AADTs of seg-
ments and intersections are used as exposure variables of the crash frequency and expected
to have positive effects on crashes. Compared with other road types, arterials have partially
limited access with comparatively higher traffic volumes. Given the same road length, the
arterial is supposed to have more traffic interactions and conflicts. Unsurprisingly, a road
segment will have fewer crashes if it only has one or two lanes. The presence of median
barriers will increase crash counts on the road segments, which is consistent with
previous studies (Anastasopoulos et al. 2012). Furthermore, as a sign of high traffic volume, the
urban indicator suggests that segments would have more crashes if located in an urban
area. As for the intersections, a variable related to the intersection control type and a vari-
able about number of legs are found significant. Intersections with signalized controls are
more likely to have more crashes. The signal control is usually installed at intersections
with higher traffic volumes where more traffic interactions occur (Wang et al. 2016). Also,
the existence of dilemma zones can lead to more crashes at the signalized intersections
(Fu, Miranda-Moreno, and Saunier 2018; Wu et al. 2017, 2018a; Zhao and Khattak 2017).
Crashes are more likely to occur at intersections with more intersecting legs (Wang and
Huang 2016). Hence, the 3-leg intersection indicator is negatively associated with the crash
frequency.
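The size of these effects can be illustrated with a short calculation. The sketch below assumes a log link, which is standard for crash count models and is suggested by the log(length) term in Equation (14); under that assumption, the coefficient of an indicator variable translates into a multiplicative effect exp(β) on the expected crash frequency. The coefficients are the posterior means from Table 7, with signs following the discussion in the text; the names are illustrative.

```python
import math

# Posterior means from Table 7; signs follow the discussion in the text
# (LEG3 and LANE1_2 are negatively associated with crash frequency).
POSTERIOR_MEANS = {
    "TRAFFIC_SIGNAL": 0.45,
    "LEG3": -0.54,
    "MEDIAN": 0.19,
    "LANE1_2": -0.43,
}


def multiplicative_effect(beta: float) -> float:
    """Ratio of expected crashes when an indicator switches from 0 to 1 (log link assumed)."""
    return math.exp(beta)


effects = {name: multiplicative_effect(b) for name, b in POSTERIOR_MEANS.items()}
percent_change = {name: 100.0 * (ratio - 1.0) for name, ratio in effects.items()}
```

Under these assumptions, a signalized intersection would be expected to have roughly 57% more crashes than an otherwise identical unsignalized one, and a 3-leg intersection roughly 42% fewer crashes than an otherwise identical 4-leg one.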
As for the macro-level socio-demographic variables, the proportion of population aged
15–24 is positively associated with macro-level crash counts, while the median household
income and the distance to the nearest urban area are negatively associated with them. The
finding about young drivers is consistent with the well-known fact that young drivers are
prone to be involved in crashes due to a lack of driving experience (Huang, Abdel-Aty, and
Darwiche 2010; Wu et al. 2018b). TADs having higher median household income would
experience fewer traffic crashes since drivers and passengers with higher income are more
likely to use seatbelts (Lerner et al. 2001) and their vehicles tend to be safer (Girasek and
Taylor 2010). As the distance of the TAD centroid from the nearest urban region increases,
total traffic crash risk is reduced – a sign of low traffic exposure in the suburban regions. It
should be noted that the proportion of tourist attraction, hotels, and motels area is found to
have a significant positive relation with the crashes at the macro-level in the non-integrated
model (Table 5). It may imply that tourists are exposed more to traffic crashes as they are
not familiar with the local roadways and rules. The result is also in line with the previous
study by Lee, Abdel-Aty, and Jiang (2015). However, this variable is not significant in the
integrated model.
The two random terms due to the spatial autocorrelation and unobserved heterogeneity
are significant for crash frequency of both macro- and micro-levels. The proportions of vari-
ability due to the spatial autocorrelation at the macro- and micro-levels are 0.65 and 0.6,
respectively, indicating the importance of considering the spatial effects in crash frequency
analysis. Compared with the non-integrated model, the standard deviations of the spa-
tial autocorrelation and unobserved heterogeneity for the crash frequency at the macro-
and micro-levels are much smaller in the integrated model, which indicates that consid-
ering the spatial interaction between the two levels can reduce the effects of random
terms.
6. Integrated hotspots identification analysis
One possible application of the proposed integrated model is to identify crash hotspots,
which are a top priority for safety treatment. A crash hotspot should not simply be the
site with the highest crash frequency; instead, it should be the one that experiences more
crashes than similar sites as a result of site-specific deficiencies (Xie et al. 2017). The potential
for safety improvement (PSI) was adopted in this study to identify hotspots, which
is defined as the expected crash frequency at the sites of interest minus the expected
crashes in similar sites (Aguero-Valverde and Jovanis 2010). Sites with higher
PSI are expected to show larger crash reductions after treatments are
implemented. Based on the integrated spatial model, the PSIs for the two levels can be
calculated as
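$$\mathrm{PRE}_m^{\mathrm{entity}} = \gamma_m \times \left(\beta^{\mathrm{seg}} x_m^{\mathrm{seg}} + \log\left(\mathrm{length}_m^{\mathrm{seg}}\right)\right) + (1 - \gamma_m) \times \left(\beta^{\mathrm{inter}} x_m^{\mathrm{inter}}\right), \tag{14}$$

$$\mathrm{PSI}_m^{\mathrm{entity}} = \lambda_m^{\mathrm{entity}} - \mathrm{PRE}_m^{\mathrm{entity}}, \tag{15}$$

$$\mathrm{PRE}_i^{\mathrm{zone}} = \sum_{m=1}^{k} \mathrm{PRE}_m^{\mathrm{entity}} \, w_{mi} \exp\left(\beta^{\mathrm{zone}} x_i^{\mathrm{zone}}\right), \tag{16}$$

$$\mathrm{PSI}_i^{\mathrm{zone}} = \lambda_i^{\mathrm{zone}} - \mathrm{PRE}_i^{\mathrm{zone}}, \tag{17}$$

where $\mathrm{PRE}_m^{\mathrm{entity}}$ and $\mathrm{PRE}_i^{\mathrm{zone}}$ are the predicted number of crashes at the micro- and macro-
levels while $\lambda_m^{\mathrm{entity}}$ and $\lambda_i^{\mathrm{zone}}$ are the expected number of crashes for the sites of interest at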
Figure 3. Comparisons of hot TADs identified by PSI at macro and micro-levels.
the two levels. It should be noted that the predicted number of crashes also refers to the
expected crashes in the similar sites without considering the unobserved heterogeneity for
each specific site, which means the same predicted values could be obtained if given the
same explanatory variables. On the other hand, the expected number of crashes for the
sites of interest considers the unobserved heterogeneity of the specific sites by including
the spatial and unstructured random terms. $\mathrm{PSI}_m^{\mathrm{entity}}$ and $\mathrm{PSI}_i^{\mathrm{zone}}$ are the micro- and macro-level
PSIs. The coefficients and random terms in the equations can be obtained by Bayesian
inference in the estimated model. The spots with positive PSIs could be considered as haz-
ardous and should have the potential to be improved. However, given time and budget
constraints, it is more efficient to identify hotspots which have the priority to implement
treatments. In our study, all sites at the macro- and micro-levels are classified into three
categories based on the calculated PSIs: hot (H), warm (W), and cold (C) sites. Hot sites
are defined as those with the top 10% of PSIs, warm sites are those with positive PSIs but
not the top 10%, and the remaining sites are cold sites. It should be noted that 10% was
commonly used as the threshold to identify hotspots (Cai, Abdel-Aty, and Lee 2017; Cheng
and Washington 2008), and it can be increased or decreased depending on researchers’
needs.
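The PSI definition in Equations (15) and (17) and the three-way classification can be sketched as follows. This is an illustration on hypothetical numbers, not the study's implementation; in particular, rounding the 10% share up to the next whole site is an assumption, and the function names are invented for this example.

```python
def potential_for_safety_improvement(expected, predicted):
    """PSI = expected crashes at the site minus predicted crashes at similar sites."""
    return [lam - pre for lam, pre in zip(expected, predicted)]


def classify_sites(psi, hot_percent=10):
    """Label each site 'H' (top hot_percent of PSIs), 'W' (positive PSI, not hot) or 'C'."""
    # Integer ceiling of hot_percent % of the sites; the rounding rule is an assumption.
    n_hot = (len(psi) * hot_percent + 99) // 100
    ranked = sorted(range(len(psi)), key=lambda i: psi[i], reverse=True)
    hot = set(ranked[:n_hot])
    labels = []
    for i, value in enumerate(psi):
        if i in hot:
            labels.append("H")
        elif value > 0:
            labels.append("W")
        else:
            labels.append("C")
    return labels


# Hypothetical expected and predicted crash counts for ten sites.
expected = [30.0, 12.0, 8.0, 20.0, 5.0, 9.0, 14.0, 3.0, 11.0, 6.0]
predicted = [18.0, 10.0, 9.5, 19.0, 6.0, 9.0, 10.0, 4.5, 12.0, 2.0]
psi = potential_for_safety_improvement(expected, predicted)
labels = classify_sites(psi)
```

With the 78 TADs of this study, this rounding rule would give 8 hot TADs, which agrees with the number of hot TADs implied by the integrated-model counts in Table 8 (5 + 3 + 0).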
The macro- and micro-level PSIs should capture different aspects of transportation safety
problems. To provide an equivalent comparison of PSIs at the macro- and
micro-levels, the PSIs at the micro-level are aggregated to the macro-level. Figure 3(a)
shows the difference between the hot TADs identified by PSIs based on the macro-level
(PSI-TAD) and sum of PSIs based on the micro-level (PSI-SUM). In summary, five (6.41%)
TADs were identified as hotspots by both the PSI-TAD and PSI-SUM, three (3.85%) TADs
were identified by PSI-TAD only, and three (3.85%) TADs were identified by PSI-SUM only.
As indicated in Figure 3(a), spatial clustering of high-risk TADs can be observed. Most of
Figure 4. Spatial distribution of hot TADs based on the integrated classification.
the identified hot TADs are located in the downtown Orlando area, especially hot TADs
identified by both PSI-TAD and PSI-SUM. Figure 3(b) illustrates the difference between the
ranks by PSI-TAD and PSI-SUM. The X- and Y-axes show the ranks in descending order of
PSI-TAD and PSI-SUM. The red line is the 45° reference line, and points on the red
line indicate TADs that receive the same rank under PSI-TAD and PSI-SUM.
As shown in Figure 3(b), most points lie near the reference line, indicating
that similar ranking results are obtained based on the PSIs at the two levels. However,
some TADs have clearly different ranking results based on PSI-TAD and PSI-SUM, revealing
that hotspot identification based on a single level may largely overlook certain
spots with excess crash frequency (Abdel-Aty et al. 2016; Huang et al., ‘Macro and
Micro Models,’ 2016). Hence, it is necessary to develop an integrated approach to identify
hotspots to overcome the shortcomings of individual identification analysis.
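The comparison between PSI-TAD and PSI-SUM can be sketched as below. The data are hypothetical, and the sketch makes the simplifying assumption that each road entity belongs to exactly one TAD, so the aggregation is a plain sum rather than a weighted one; the function names are illustrative.

```python
from collections import defaultdict


def aggregate_psi_by_zone(entity_psi, entity_zone):
    """Sum micro-level PSIs over the road entities assigned to each zone (PSI-SUM)."""
    totals = defaultdict(float)
    for entity, value in entity_psi.items():
        totals[entity_zone[entity]] += value
    return dict(totals)


def top_zones(zone_psi, n_hot):
    """Return the n_hot zones with the largest PSI."""
    ranked = sorted(zone_psi, key=zone_psi.get, reverse=True)
    return set(ranked[:n_hot])


# Hypothetical data: PSI per road entity, entity-to-TAD assignment, macro-level PSI per TAD.
entity_psi = {"seg1": 4.0, "seg2": -1.0, "int1": 6.0, "int2": 0.5, "seg3": 2.0, "int3": -3.0}
entity_zone = {"seg1": "A", "seg2": "A", "int1": "B", "int2": "C", "seg3": "C", "int3": "D"}
psi_tad = {"A": 10.0, "B": 2.0, "C": 7.0, "D": -4.0}

psi_sum = aggregate_psi_by_zone(entity_psi, entity_zone)
hot_tad, hot_sum = top_zones(psi_tad, 2), top_zones(psi_sum, 2)

# Zones flagged by both measures, by PSI-TAD only, and by PSI-SUM only.
both, tad_only, sum_only = hot_tad & hot_sum, hot_tad - hot_sum, hot_sum - hot_tad
```

In this toy example one zone is flagged by both measures and one by each measure alone, mirroring the kind of disagreement between single-level rankings that motivates the integrated approach.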
At the macro-level, an integrated classification is suggested based on TADs to sup-
port policy-making and long-term transportation planning. Given that three categories are
adopted for the classification at the two levels, there are nine candidate combination classi-
fications: HH, HW, HC, WH, WW, WC, CH, CW, and CC. The former letter represents the safety
at the macro-level while the latter letter denotes the combined crash risk based on the
micro-level. For example, the ‘HH’ refers to the TADs with a serious safety problem at both
macro- and micro-levels. Table 8 summarizes the number of TADs by the integrated
category. In addition to the integrated category based on the integrated model, the category
Table 8. TADs and road entities by integrated category.
Sites Measure HH HW WH HC CH WW WC CW CC
Non-integrated model
TAD Counts 6 2 1 0 1 30 4 19 15
TAD Percentage 7.69 2.56 1.28 0.00 1.28 38.46 5.13 24.36 19.23
Intersection Counts 21 40 119 27 41 181 204 119 130
Intersection Percentage 2.38 4.54 13.49 3.06 4.65 20.52 23.13 13.49 14.74
Segments Counts 68 60 263 50 65 487 428 468 545
Segments Percentage 2.79 2.47 10.81 2.05 2.67 20.01 17.58 19.23 22.39
Integrated model
TAD Counts 5 3 3 0 0 49 1 0 17
TAD Percentage 6.41 3.85 3.85 0.00 0.00 62.82 1.28 0.00 21.79
Intersection Counts 23 74 142 1 26 356 84 114 62
Intersection Percentage 2.61 8.39 16.10 0.11 2.95 40.36 9.52 12.93 7.03
Segments Counts 83 146 295 5 34 913 249 397 312
Segments Percentage 3.41 6.00 12.12 0.21 1.40 37.51 10.23 16.31 12.82
Difference between models
TAD Counts 1 1 2 0 1 19 3 19 2
TAD Percentage 1.28 1.29 2.57 0.00 1.28 24.36 3.85 24.36 2.56
Intersection Counts 2 34 23 26 15 175 120 5 68
Intersection Percentage 0.23 3.85 2.61 2.95 1.70 19.84 13.61 0.56 7.71
Segments Counts 15 86 32 45 31 426 179 71 233
Segments Percentage 0.62 3.53 1.31 1.84 1.27 17.50 7.35 2.92 9.57
results based on the non-integrated models are provided. Compared to the results based
on the non-integrated model, the integrated model could identify more ‘WW’ categories
and fewer ‘WC’ and ‘CW’ categories, which means that the two levels yield more consistent
categories under the integrated model. Five (6.41%) TADs are
classified as ‘HH’; these are the top priority for safety treatments since they have the highest safety
risks at both levels according to the integrated model. The integrated classification result based
on the integrated model is illustrated in Figure 4. Since the numbers of ‘HW’ and ‘WH’ TADs
are small, they are merged for brevity. Hence, five categories are
presented, i.e. ‘HH’, ‘HW/WH’, ‘WW’, ‘WC’, and ‘CC’. As demonstrated in Figure 4, spatial
clustering of high-risk zones can be observed. Special attention should be paid to Downtown
Orlando since most zones with high crash risk are located in this area. The zones with
moderate crash risk cluster in the north corner of the study area while the safe zones are
rather spatially isolated.
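The two-letter integrated categories can be assembled as in the following sketch. The labels are hypothetical; the first letter is taken from the macro-level classification and the second from the aggregated micro-level classification, as described above, and the helper names are illustrative.

```python
from collections import Counter


def integrated_categories(macro_labels, micro_labels):
    """Concatenate the macro-level label (first letter) and the micro-level label (second)."""
    return {zone: macro_labels[zone] + micro_labels[zone] for zone in macro_labels}


def merged_for_mapping(category):
    """Merge 'HW' and 'WH' into a single class, as done for the map of hot TADs."""
    return "HW/WH" if category in ("HW", "WH") else category


# Hypothetical labels for six TADs.
macro_labels = {"A": "H", "B": "H", "C": "W", "D": "W", "E": "W", "F": "C"}
micro_labels = {"A": "H", "B": "W", "C": "H", "D": "W", "E": "C", "F": "C"}

categories = integrated_categories(macro_labels, micro_labels)
counts = Counter(merged_for_mapping(c) for c in categories.values())
```

Counting the merged categories in this way gives the kind of summary shown for TADs in Table 8, with ‘HW’ and ‘WH’ reported together for mapping.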
Besides the integrated classification at the macro-level, an integrated classification analysis
is also conducted at the micro-level to help provide appropriate engineering treatments
to reduce crashes in specific road entities. Similar to the macro-level integration approach,
all sites (segments and intersections) are classified into nine categories including two scale
groups (micro and macro) and three risk levels (hot, warm, and cold). Hence, for example,
‘HH’ indicates that a road entity has a safety problem and is located in a TAD with serious
safety issues. For such a road entity, both appropriate engineering treatments and enforcement
strategies should be implemented. As summarized in Table 8, most road entities with
high risk are in the dangerous area. Besides, more ‘warm’ intersections and segments could
be found in ‘warm’ TADs based on the integrated model. Moreover, Figure 5 presents which
road entities should be targeted in downtown Orlando since the area has most zones of
interest based on the integrated model.
Figure 5. Spatial distribution of road entities based on the integrated classification in Downtown
Orlando.
7. Conclusion
Crash frequency modeling plays an essential role in transportation safety as it
can estimate the effects of macro- and micro-level factors on safety and identify hotspots
with safety issues. This study formulated and estimated a Bayesian integrated spatial
model to analyze crash frequency at the macro- and micro-levels, simultaneously. Based
on the spatial interaction between zones and road entities, the expected crash counts at
the macro- and micro-levels were linked by an adjustment factor. The adjustment factor
was estimated using a set of macro-level socio-demographic variables and indicates
how many more crashes occur at the macro-level given the same road network but
different socio-demographic characteristics. Besides the spatial interaction, the spatial
autocorrelations among zones and among road entities were considered in the model. In particular, the
spatial autocorrelation at the micro-level was considered for different types of road entities (i.e.
segments and intersections) with a joint structure. Two independent non-integrated mod-
els were also estimated for comparison. The crashes that occurred on both segments and
intersections in Orlando, Florida during 2010–2012 were selected for the empirical anal-
ysis. Then, the selected crashes were aggregated at both macro- and micro-levels and a
comprehensive set of exogenous variables from the two levels were selected for the model
estimation.
The results of the integrated model clearly highlighted the existence of spatial inter-
action between the macro- and micro-level crash counts and confirmed the benefit of
integrating modeling analysis of crash counts for the two levels. The comparison results
indicated that the integrated model significantly outperformed the non-integrated model at
the macro-level while the integrated model provided a slightly better model performance
for micro-level crash frequency analysis. The integrated model provided a combination of
significant variables from both micro- and macro-levels including segment-based variables
(e.g. AADT, arterial indicator, 1 or 2 lanes indicator), intersection-based variables (e.g. AADT
on major and minor approaches, traffic signal control indicator), and TAD-based socioeco-
nomic variables (e.g. proportion of population aged 15–24, median household income).
The identification of significant macro-level variables can support the planning
process to enhance transportation safety, while engineering solutions to reduce
traffic crashes can be suggested based on micro-level contributing factors. Therefore, the proposed model
can be employed as a useful tool that links the transportation safety planning and traffic
engineering countermeasures.
This study further contributed to the literature by proposing a novel integrated method
to identify hotspots of crashes at both macro- and micro-levels. The PSI was adopted as
a measure to identify the hotspots for the two levels. The macro-level hotspot identifica-
tion can detect zones with area-wide planning-level safety problems while the micro-level
approach is capable of identifying specific road entities with high risks. Since hotspot
identification at a single level may ignore certain spots with excess crash frequency, an integrated
hotspot identification approach was suggested. Both TADs and road entities were classi-
fied into nine categories with the consideration of two levels (macro- and micro-levels)
and three crash risk levels (hot, warm, and cold). With the integrated hotspot identification
approach, better classification results can be obtained for both TADs and road
entities from comprehensive transportation planning and traffic engineering
perspectives.
In this study, only the road entities with traffic data were included. Future work could
extend the study by including the road entities that were excluded due to missing
traffic data, especially local roads. The proposed model could then be further investigated
with more complete data at the macro-level. In addition, the
integrated model was calibrated using 3 years of data. It would be worthwhile to
further validate the developed model using more recent data once they become
available.
Acknowledgement
The opinions, findings and conclusions expressed in this paper are those of the authors and not
necessarily those of the Florida Department of Transportation.
Disclosure statement
No potential conflict of interest was reported by the authors.
Funding
The authors would like to thank the Florida Department of Transportation for funding this study.
References
Abdel-Aty, M., J. Lee, N. Eluru, Q. Cai, S. Al Amili, and S. Alarifi. 2016. Enhancing and Generalizing the Two-
Level Screening Approach Incorporating the Highway Safety Manual (HSM) Methods, Phase 2. Florida
Department of Transportation.
Abdel-Aty, M., J. Lee, C. Siddiqui, and K. Choi. 2013. “Geographical Unit Based Analysis in the Context
of Transportation Safety Planning.” Transportation Research Part A: Policy and Practice 49: 62–75.
Abdel-Aty, M., and A. E. Radwan. 2000. “Modeling Traffic Accident Occurrence and Involvement.”
Accident Analysis & Prevention 32: 633–642.
Aguero-Valverde, J., and P. P. Jovanis. 2010. “Spatial Correlation in Multilevel Crash Frequency Mod-
els: Effects of Different Neighboring Structures.” Transportation Research Record: Journal of the
Transportation Research Board 2165: 21–32.
Anastasopoulos, P. C., and F. L. Mannering. 2009. “A Note on Modeling Vehicle Accident Frequencies
with Random-Parameters Count Models.” Accident Analysis & Prevention 41: 153–159.
Anastasopoulos, P. C., F. L. Mannering, V. N. Shankar, and J. E. Haddock. 2012. “A Study of Factors
Affecting Highway Accident Rates Using the Random-parameters Tobit Model.” Accident Analysis
& Prevention 45: 628–633.
Besag, J., J. York, and A. Mollié. 1991. “Bayesian Image Restoration, with two Applications in Spatial
Statistics.” Annals of the Institute of Statistical Mathematics 43: 1–20.
Cai, Q., M. Abdel-Aty, and J. Lee. 2017. “Macro-level Vulnerable Road Users Crash Analysis: a Bayesian
Joint Modeling Approach of Frequency and Proportion.” Accident Analysis & Prevention 107: 11–19.
Cai, Q., M. Abdel-Aty, J. Lee, and N. Eluru. 2017. “Comparative Analysis of Zonal Systems for Macro-
Level Crash Modeling.” Journal of Safety Research 61: 157–166.
Cai, Q., J. Lee, N. Eluru, and M. Abdel-Aty. 2016. “Macro-level Pedestrian and Bicycle Crash Analysis:
Incorporating Spatial Spillover Effects in Dual State Count Models.” Accident Analysis & Prevention
93: 14–22.
Cheng, W., and S. Washington. 2008. “New Criteria for Evaluating Methods of Identifying Hot Spots.”
Transportation Research Record: Journal of the Transportation Research Board 2083: 76–85.
Dong, C., D. B. Clarke, X. Yan, A. Khattak, and B. Huang. 2014. “Multivariate Random-Parameters Zero-
Inflated Negative Binomial Regression Model: An Application to Estimate Crash Frequencies at
Intersections.” Accident Analysis & Prevention 70: 320–329.
El-Basyouny, K., and T. Sayed. 2009. “Collision Prediction Models Using Multivariate Poisson-
Lognormal Regression.” Accident Analysis & Prevention 41 (4): 820–828.
FHWA, Census Transportation Planning Products (CTPP). 2011. 2010 Census Traffic Analysis Zone
Program MAF/TIGER Partnership Software Participant Guidelines.
Fu, T., L. Miranda-Moreno, and N. Saunier. 2018. “A Novel Framework to Evaluate Pedestrian Safety at
Non-signalized Locations.” Accident Analysis & Prevention 111: 23–33.
Garber, N., and A. Ehrhart. 2000. “Effect of Speed, Follow, and Geometric Characteristics on Crash
Frequency for Two-Lane Highways.” Transportation Research Record: Journal of the Transportation
Research Board 1717: 76–83.
Girasek, D. C., and B. Taylor. 2010. “An Exploratory Study of the Relationship between Socioeconomic
Status and Motor Vehicle Safety Features.” Traffic Injury Prevention 11 (2): 151–155.
Hadayeghi, A., A. S. Shalaby, and B. N. Persaud. 2010. “Development of Planning Level Transportation
Safety Tools Using Geographically Weighted Poisson Regression.” Accident Analysis & Prevention 42:
676–688.
Houston, R. W. 1998. “The Transportation Equity Act for the 21st Century. Institute of Transportation
Engineers.” ITE Journal 68 (7): 45.
Huang, H., M. Abdel-Aty, and A. Darwiche. 2010. “County-level Crash Risk Analysis in Florida: Bayesian
Spatial Modeling.” Transportation Research Record: Journal of the Transportation Research Board
2148: 27–37.
Huang, H., B. Song, P. Xu, Q. Zeng, J. Lee, and M. Abdel-Aty. 2016. “Macro and Micro Models for Zonal
Crash Prediction with Application in Hot Zones Identification.” Journal of Transport Geography 54:
248–256.
Huang, H., Q. Zeng, X. Pei, S. C. Wong, and P. Xu. 2016. “Predicting Crash Frequency Using an Optimized
Radial Basis Function Neural Network Model.” Transportmetrica A: Transport Science 12 (4): 330–345.
Lee, J., M. Abdel-Aty, and Q. Cai. 2017. “Intersection Crash Prediction Modeling with Macro-Level Data
from Various Geographic Units.” Accident Analysis & Prevention 102: 213–226.
Lee, J., M. Abdel-Aty, K. Choi, and H. Huang. 2015. “Multi-level Hot Zone Identification for Pedestrian
Safety.” Accident Analysis & Prevention 76: 64–73.
Lee, J., M. Abdel-Aty, and X. Jiang. 2014. “Development of Zone System for Macro-Level Traffic Safety
Analysis.” Journal of Transport Geography 38: 13–21.
Lee, J., M. Abdel-Aty, and X. Jiang. 2015. “Multivariate Crash Modeling for Motor Vehicle and Non-
Motorized Modes at the Macroscopic Level.” Accident Analysis & Prevention 78: 146–154.
Lee, J., B. Nam, and M. Abdel-Aty. 2015. “Effects of Pavement Surface Conditions on Traffic Crash
Severity.” Journal of Transportation Engineering 141 (10): 04015020.
Lerner, E. B., D. V. Jehle, A. J. Bilittier, R. M. Moscati, C. M. Connery, and G. Stiller. 2001. “The Influence of
Demographic Factors on Seatbelt Use by Adults Injured in Motor Vehicle Crashes.” Accident Analysis
& Prevention 33: 659–662.
Lord, D., and F. Mannering. 2010. “The Statistical Analysis of Crash-Frequency Data: A Review and
Assessment of Methodological Alternatives.” Transportation Research Part A: Policy and Practice 44:
291–305.
Menard, S. 1995. Applied Logistic Regression Analysis: Sage University Series on Quantitative Applications
in the Social Sciences. Thousand Oaks, CA: Sage.
Miaou, S. P., J. J. Song, and B. K. Mallick. 2003. “Roadway Traffic Crash Mapping: A Space-Time Modeling
Approach.” Journal of Transportation and Statistics 6: 33–58.
Park, J., M. Abdel-Aty, J. Lee, and C. Lee. 2015. “Developing Crash Modification Functions to Assess
Safety Effects of Adding Bike Lanes for Urban Arterials with Different Roadway and Socio-Economic
Characteristics.” Accident Analysis & Prevention 74: 179–191.
Quddus, M. A. 2008. “Modelling Area-Wide Count Outcomes with Spatial Correlation and Heterogeneity:
An Analysis of London Crash Data.” Accident Analysis & Prevention 40: 1486–1497.
Shankar, V., F. Mannering, and W. Barfield. 1995. “Effect of Roadway Geometrics and Environmental
Factors on Rural Freeway Accident Frequencies.” Accident Analysis & Prevention 27: 371–389.
Siddiqui, C., and M. Abdel-Aty. 2012. “Nature of Modeling Boundary Pedestrian Crashes at Zones.”
Transportation Research Record: Journal of the Transportation Research Board 2299: 31–40.
Taylor, M. C., A. Baruya, and J. V. Kennedy. 2002. The Relationship Between Speed and Accidents on Rural
Single-Carriageway Roads (TRL Report 511).
US Congress. 2012. Moving Ahead for Progress in the 21st Century Act. Washington, DC.
U.S. Department of Transportation. 2015. The Fixing America’s Surface Transportation Act. Accessed
April 24, 2017. https://www.transportation.gov/fastact.
Wang, X., and M. Abdel-Aty. 2008. “Modeling Left-Turn Crash Occurrence at Signalized Intersections
by Conflicting Patterns.” Accident Analysis & Prevention 40: 76–88.
Wang, L., M. Abdel-Aty, and J. Lee. 2017. “Safety Analytics for Integrating Crash Frequency and Real-
Time Risk Modeling for Expressways.” Accident Analysis & Prevention 104: 58–64.
Wang, L., M. Abdel-Aty, Q. Shi, and J. Park. 2015. “Real-time Crash Prediction for Expressway
Weaving Segments.” Transportation Research Part C: Emerging Technologies 61: 1–10.
Wang, Z., Q. Cai, B. Wu, L. Zheng, and Y. Wang. 2016. “Shockwave-based Queue Estimation Approach
for Undersaturated and Oversaturated Signalized Intersections Using Multi-Source Detection
Data.” Journal of Intelligent Transportation Systems 21 (3): 1–12.
Wang, J., and H. Huang. 2016. “Road Network Safety Evaluation Using Bayesian Hierarchical Joint
Model.” Accident Analysis & Prevention 90: 152–158.
Wang, X., J. Yuan, G. G. Schultz, and S. Fang. 2018. “Investigating the Safety Impact of Roadway
Network Features of Suburban Arterials in Shanghai.” Accident Analysis & Prevention 113: 137–148.
Washington, S., I. Van Schalkwyk, S. Mitra, M. Mayer, E. Dumbaugh, and M. Zoll. 2006. NCHRP Report
546: Incorporating Safety into Long-Range Transportation Planning. Washington, DC: Transportation
Research Board.
Wong, S. C., N. N. Sze, and Y. C. Li. 2007. “Contributory Factors to Traffic Crashes at Signalized
Intersections in Hong Kong.” Accident Analysis & Prevention 39: 1107–1113.
Wu, Y., M. Abdel-Aty, J. Park, and R. M. Selby. 2018a. “Effects of Real-Time Warning Systems on Driving
Under Fog Conditions Using an Empirically Supported Speed Choice Modeling Framework.”
Transportation Research Part C: Emerging Technologies 86: 97–110.
Wu, Y., M. Abdel-Aty, J. Park, and R. M. Selby. 2018b. “Effects of Real-Time Warning Systems on Driving
Under Fog Conditions Using Real-Time Data.” Transportation Research Part C: Emerging Technologies
87: 11–25.
Wu, Y., Y. Ding, M. Abdel-Aty, B. Jia, and X. Yan. 2017. “Comparison of Proposed Countermeasures for
Dilemma Zone at Signalized Intersections Based on Cellular Automata Simulations.” Accident Analysis
& Prevention.
Xie, K., K. Ozbay, A. Kurkcu, and H. Yang. 2017. “Analysis of Traffic Crashes Involving Pedestrians Using
Big Data: Investigation of Contributing Factors and Identification of Hotspots.” Risk Analysis 37 (8):
1459–1476.
Xu, P., and H. Huang. 2015. “Modeling Crash Spatial Heterogeneity: Random Parameter Versus
Geographically Weighting.” Accident Analysis & Prevention 75: 16–25.
Yang, Z., M. L. Franz, S. Zhu, J. Mahmoudi, A. Nasri, and L. Zhang. 2018. “Analysis of Washington, DC
Taxi Demand Using GPS and Land-use Data.” Journal of Transport Geography 66: 35–44.
Yasmin, S., and N. Eluru. 2016. “Latent Segmentation Based Count Models: Analysis of Bicycle Safety
in Montreal and Toronto.” Accident Analysis & Prevention 95: 157–171.
Ye, X., R. M. Pendyala, S. P. Washington, K. Konduri, and J. Oh. 2009. “A Simultaneous Equations Model
of Crash Frequency by Collision Type for Rural Intersections.” Safety Science 47: 443–452.
Yu, R., and M. Abdel-Aty. 2013. “Multi-level Bayesian Analyses for Single- and Multi-Vehicle Freeway
Crashes.” Accident Analysis & Prevention 58: 97–105.
Zeng, Q., and H. Huang. 2014. “A Stable and Optimized Neural Network Model for Crash Injury Severity
Prediction.” Accident Analysis & Prevention 73: 351–358.
Zhao, S., and A. J. Khattak. 2017. “Factors Associated with Self-Reported Inattentive Driving at
Highway-Rail Grade Crossings.” Accident Analysis & Prevention 109: 113–122.