Estimating a Mixed-profile MDCEV: Case of Daily Activity
Type and Duration
Ali Shamshiripour1 and Amir Samimi 2,*
1PhD Student, Department of Civil and Materials Engineering, University of Illinois at Chicago,
Chicago, IL, USA
2 Assistant Professor, Department of Civil Engineering, Sharif University of Technology, Tehran, Iran
* Corresponding author, Email: asamimi@sharif.edu
Abstract
MDCEV has become popular in recent years. Yet, the model suffers from an ‘empirical
identification’ issue that is mainly due to inter-relations between two of its parameters, α and γ.
This paper presents a hybrid optimization paradigm (named HELPME) to address this issue in a
basic MDCEV formulation and take full advantage of the model by estimating a ‘mixed-profile’.
HELPME benefits from a coarse-to-fine search strategy, in which a customized
Electromagnetism-like (EML) meta-heuristic precedes a gradient-based approach. The Atlanta
Regional Travel Survey (2011) is used to empirically analyse performance of HELPME as well
as the significance of the accuracy gap between the mixed-profile, and α- and γ-profiles. As part of
the results, it is observed that in-sample fit is significantly improved, percentage error of out-of-sample
prediction is reduced by up to 97% at a 90% confidence level, and bias of out-of-sample
predictions is reduced by up to 67%.
Keywords: Maximum-likelihood Estimation; Electromagnetism; MDCEV; HELPME.
1. Introduction
The basic Multiple Discrete Continuous Extreme Value (MDCEV) model has an ‘empirical
identification’ issue between the satiation (α) and translation (γ) parameters, as Bhat (2008) states:
“Clearly, both these effects operate in different ways, and different combinations of their values lead to
different satiation profiles. However, empirically speaking, it is very difficult to disentangle the two
effects separately.” Empirical unidentifiability was first introduced in Kenny (1979) as a
situation in which, although two parameters of a model are theoretically identifiable, it is still very
difficult to estimate them using the data at hand (Kenny 1979 and Vij & Walker 2014). Given this
definition, the identification issue of MDCEV is theoretically resolvable.¹
To deal with the empirical identification issue, Bhat (2008) proposed two alternative
approaches based on imposing different restrictions on the model. The first approach is to impose
predefined restrictions on one or both parameters, estimate the restricted profiles, and then choose the
best model (Bhat 2008, p. 9-10). Particularly, an α-profile sets γ_k to 1 for all goods k, and a γ-profile
assumes that all α_k values approach zero. We refer to these profiles as conventional, since they have
extensively been used in past studies including long-distance travelling analysis (Van Nostrand et al.
2013), household activity-accompaniment patterns (Bhat et al. 2013), energy consumption behaviour
(Yu et al. 2011), and annual vacation time-use (LaMondia et al. 2008). As a second approach, Bhat
(2008, p. 10) also suggests picking either of the profiles a priori and experimentally finding proper
¹ By definition, two unknown parameters of a model are theoretically unidentifiable if two distinct combinations
of their values could be found such that they both result in equal (not just similar) distributions of the outcome
variable (Walker 2007 and Vij & Walker 2014). For instance, the variance of the error term (σ) and the coefficients of
attributes (β) in an MNL formulation are known to be theoretically unidentifiable (Train 2009). That is, there is
no way to estimate them both, in that the probability distributions are shown to be functions of β/σ rather than β
and σ.
fixed values of (for a -profile) or (for an -profile). The current paper proposes a systematic
search paradigm (HELPME) to fulfil the second proposition of Bhat (2008, p. 10). In fact, HELPME
makes it possible to find the best values in which  (or) can be fixed and take full advantage of the
model.
The remainder of this paper is organized as follows. First, we present some background
knowledge on MDCEV including interpretation of each of its parameters. After that, HELPME is
expressed in detail. Then, an empirical analysis on outdoor activity types and durations is performed
using the 2008 version of the basic MDCEV (i.e., the formulation that adopts IID extreme value error terms) and the
Atlanta Regional Travel Survey (ARC 2011) dataset. Particularly, HELPME’s performance and
differences between conventional- and mixed-profiles are discussed in detail. Lastly, the paper
concludes with discussions on remarkable findings of the present research as well as directions for
future studies.
2. MDCEV formulation and interpretation
Traditional discrete choice models deal with a choice set of perfectly substitutable alternatives. In many
choice situations, however, the consumer demands certain amounts of multiple alternatives, given a
limited budget. Various methods have been developed to analyse such choices (Manchanda et al. 1999;
Hendel 1999; Edwards & Allenby 2003; Bhat & Srinivasan 2005; Wales & Woodland 1983; Kim et al.
2002; and Bhat 2005 and 2008). One closed-form formulation is MDCEV, first introduced in Bhat
(2005). Bhat (2008) later modified the utility function of that model into a more general and
easier-to-interpret form. Meanwhile, other variants of the model have been
proposed (Bhat et al. 2006; Pinjari & Bhat 2010; Vasquez & Hanemann 2008; Eluru et al. 2009;
Sobhani et al. 2013; and Sobhani et al. 2014), forming the large family of MDCEV models. The
current study, however, only focuses on the basic MDCEV formulation proposed in Bhat (2008).
In the 2008 version of MDCEV, the total utility that an individual acquires from consuming c_k units of
every good k is determined by Eq. 1 (Bhat 2008), assuming: (1) independent observed utility portions
and (2) non-negative marginal utilities. In this function, ψ, α, and γ are three sets of parameters that
should satisfy certain conditions (ψ_k > 0, α_k ≤ 1, and γ_k > 0). The term ψ_k captures the baseline (i.e.,
pre-consumption) marginal utility of k, that is, the slope of its utility function with respect to c_k when
c_k = 0 (Bhat 2008, p. 7). This parameter’s value is given by Eq. 2, which guarantees positivity. In Eq.
2, z_k and β, respectively, stand for the vector of alternative goods’ characteristics and their
corresponding coefficients. Moreover, ε_k denotes the error term of the kth good. In the basic MDCEV
model which is explored in the current paper, error terms are assumed to follow an IID extreme value
distribution with the scale parameter σ, which may further be normalized to 1 (Bhat 2008).

U(c) = Σ_{k=1}^{K} (γ_k / α_k) ψ_k [ (c_k / γ_k + 1)^{α_k} − 1 ]   (1)

ψ_k = exp(β′ z_k + ε_k)   (2)
α_k is the satiation parameter and governs the diminishing rate of each good’s marginal utility as its
consumption quantity increases. γ_k is named the translation parameter, as its main
role is to prevent indifference curves from becoming asymptotic to the axes in the positive orthant, by
shifting the points at which they approach the axes (Bhat 2008). In addition, γ also influences
satiation patterns by changing the slope of indifference curves. The empirical identification problem in
MDCEV mainly results from such interrelationships between the roles of α and γ.

On the interpretation side, α_k = 1 indicates insatiable consumption² of good k, regardless of
the value of γ_k (Bhat 2008). In the case that γ remains constant, lower values of α produce
more satiable utility profiles (Bhat 2008). Theoretically speaking, α_k → −∞ corresponds to the case of
immediate satiation (Bhat 2008). In the case of constant α, further, the satiation effect intensifies as γ
decreases (Bhat 2008). As mentioned, however, α and γ simultaneously affect the rate of the utility
function’s curvature in any version of MDCEV. Given such a shared role, one can hardly interpret
either α or γ alone when they both are subject to variation over alternatives (e.g. in a mixed-profile).
This is not a limitation of the model, however, since the interpretation can simply be
accomplished through development of consumption-utility plots. Such graphs are straightforward to
draw and easy to interpret. One may simply set ψ_k to any unified fixed value for all k, plot the utility
value of each good versus the consumption amount of that good, and visually interpret the results.

² In a hypothetical situation of α_k = 1 for all k, therefore, the model collapses to a simple MNL (Bhat 2008).
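As a concrete illustration of these satiation roles, the sub-utility of a single good in Eq. 1 can be evaluated numerically. The sketch below is in Python rather than the paper's MATA code, and the values of ψ, α, and γ are arbitrary illustrative choices; holding ψ fixed, a smaller α flattens the utility curve faster:

```python
import numpy as np

def sub_utility(c, psi, alpha, gamma):
    # One good's sub-utility in Eq. 1: (gamma/alpha) * psi * ((c/gamma + 1)^alpha - 1)
    return (gamma / alpha) * psi * ((c / gamma + 1.0) ** alpha - 1.0)

c = np.linspace(0.0, 10.0, 101)
u_mild = sub_utility(c, psi=1.0, alpha=0.8, gamma=1.0)    # weak satiation
u_strong = sub_utility(c, psi=1.0, alpha=0.2, gamma=1.0)  # strong satiation

# Both curves leave c = 0 with the same baseline slope (psi), but the
# low-alpha profile flattens much sooner:
slope_mild = u_mild[-1] - u_mild[-2]
slope_strong = u_strong[-1] - u_strong[-2]
```

Plotting such curves for each good, with ψ_k held at a common value, is exactly the consumption-utility graph described above.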
Let p_k and E denote the price of the kth good and the budget, respectively. In this notation, the
budget constraint implies that Σ_{k=1}^{K} e_k = E, where e_k = p_k c_k (Bhat 2008). Employing the Lagrangian
approach and KT conditions, the probability that the estimated optimal expenditure
allocations match the expenditure patterns observed in the data is derived as Eq. 3. In this formula, i
and M, respectively, denote the index and the total number of goods that each individual consumes.
P(e_1*, e_2*, …, e_M*, 0, …, 0) = [ Π_{i=1}^{M} f_i ] [ Σ_{i=1}^{M} (1/f_i) ] [ Π_{i=1}^{M} exp(V_i) / ( Σ_{k=1}^{K} exp(V_k) )^M ] (M − 1)!   (3)

Where:

V_i = β′ z_i + (α_i − 1) ln( e_i*/(γ_i p_i) + 1 ) − ln(p_i)   (4)

f_i = (1 − α_i) / (e_i* + γ_i p_i)   (5)
Having the choice probability function in Eq. 3, the unknown parameters of a mixed-profile for the basic
MDCEV model can be estimated by maximizing the model’s log-likelihood LL = Σ_{n=1}^{N} ln(P_n), subject
to ψ_k > 0, α_k ≤ 1, and γ_k > 0 for every k. The maximization, in this paper, is conducted using
HELPME to eliminate the need for imposing further constraints (e.g. α_k → 0 in a γ-profile or γ_k = 1
in an α-profile).
3. HELPME
Various methods have been used for likelihood maximization. A popular gradient-based algorithm is
BFGS (Broyden-Fletcher-Goldfarb-Shanno). BFGS is a quasi-Newton method that iteratively utilizes
first-order gradients to approximate the Hessian (Greene 2003). However, BFGS dramatically loses its
efficiency as the model becomes more complex, similar to other gradient-based algorithms (Train
2009). Meta-heuristic algorithms, on the other hand, are based on random movements inspired
by ad-hoc rules, instead of gradient information. ElectroMagnetism-Like (EML) is a popular meta-heuristic
that has proven successful in different disciplines (Yurtkuran & Emel 2010 and Liao et
al. 2015). Efforts (Liu & Mahmassani 2000) have also been made to combine meta-heuristics and
gradient-based methods to boost the optimizers’ capabilities. To overcome the empirical identification
problem of the MDCEV model, we introduce a hybrid paradigm combining EML and BFGS in a
coarse-to-fine search strategy.
The paradigm is named HELPME (Hybrid Electromagnetism-Like Paradigm for Maximum-likelihood
Estimation). Fig. 1 shows a general scheme of HELPME. The Coarse stage of HELPME
employs a customized EML algorithm to find a good starting value that is then fed into a recursive-BFGS
in the Fine stage to be fine-tuned. Compared to gradient-based optimizers, this strategy helps to: (1)
amplify the optimizer’s efficiency in estimation of models with complex likelihood functions and (2)
reduce the chance of falling into a local optimum trap. Compared to a standard meta-heuristic, on the other
hand, it helps to: (1) reduce sensitivity of the final estimates’ accuracy to the random numbers, and (2)
guarantee reaching a point at which partial gradients are sufficiently close to zero.
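The coarse-to-fine idea itself is independent of EML. The toy Python sketch below substitutes a plain random population search for the Coarse stage and SciPy's BFGS for the Fine stage, just to show the division of labour; the objective `f` and all settings are illustrative, not from the paper:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def coarse_to_fine(neg_ll, bounds, n_particles=40, n_iter=60):
    """Toy coarse-to-fine search: a random population stage (standing in for
    HELPME's customized EML) followed by gradient-based BFGS polishing."""
    lo, hi = bounds[:, 0], bounds[:, 1]
    pop = lo + rng.random((n_particles, lo.size)) * (hi - lo)   # coarse stage
    for _ in range(n_iter):
        cand = np.clip(pop + rng.normal(scale=0.1, size=pop.shape) * (hi - lo), lo, hi)
        for i in range(n_particles):
            if neg_ll(cand[i]) < neg_ll(pop[i]):  # greedy acceptance
                pop[i] = cand[i]
    best = min(pop, key=neg_ll)
    res = minimize(neg_ll, best, method="BFGS")                 # fine stage
    return best, res.x

# a multimodal toy surface standing in for the negative log-likelihood
f = lambda x: x[0] ** 2 + x[1] ** 2 + 2.0 * np.sin(3.0 * x[0]) ** 2
best_coarse, x_fine = coarse_to_fine(f, np.array([[-3.0, 3.0], [-3.0, 3.0]]))
```

The fine stage can only improve on the best coarse particle, which is the point of running the cheap population search first.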
EML is an evolutionary meta-heuristic that imitates natural interactions between electrically
charged particles (Birbil & Fang 2003). The algorithm initiates with a starting-generation of feasible
particles in the domain of the problem, with each particle representing feasible values for the optimization
variables (i.e. unknowns of the model to be estimated). Then, a certain amount of hypothetical electrical
charge is assigned to each particle to flag ‘good’ and ‘bad’ particles (Birbil & Fang 2003). The
electrical charge of each particle is assumed to be proportionate to the value of the log-likelihood
associated with it, and determines the particle’s capacity to attract or repel others (Birbil & Fang
2003). The electromagnetic force between each pair of particles is then calculated and exerted (Birbil &
Fang 2003), such that good particles attract others while bad particles repel them. These forces eventually
lead particles to move and produce the next generation of particles (Birbil & Fang 2003). To further
enhance the chance of escaping from local optima, a random local search is also implemented in each
iteration. The procedure iterates until the ‘best’ particle (i.e. the particle with the
largest log-likelihood) gets sufficiently close to the maximum of the log-likelihood function. The Coarse stage
of HELPME, in addition to suggesting some customizations in each of these steps, introduces a new
direct-search heuristic (Stubborn Movement) to further boost the algorithm’s efficiency.
In each of the following three sub-sections, a step of EML is explained first, and then the
suggested customizations are discussed. After that, the remaining components of HELPME are elaborated.
All the notations are defined in Table 1. Furthermore, the marginal effect of each EML customization on
estimating a mixed-profile MDCEV is discussed in section 4.1. Source code of HELPME along with
the data used in this paper is also accessible online³. For a more detailed discussion on EML, readers
may also refer to Birbil & Fang (2003) and Yurtkuran & Emel (2010).

³ Available at https://www.dropbox.com/sh/lhq4cirywuv6f2p/AABpvZJ9C6bvOZNggyUrkf31a?dl=0
Figure 1. General scheme of the HELPME algorithm
Table 1. Description of notations used in HELPME

Variables:
- Number of dimensions of the problem
- Number of particles
- Upper bound of the dth coordinate
- Lower bound of the dth coordinate
- dth coordinate of the pth particle in the gth generation
- Vector of coordinates of the pth particle in the gth generation
- Vector of coordinates of the particle which has the best function value
- Objective function value associated with the pth particle in the gth generation

Parameters (Random Local Search):
- A random number drawn from the standard uniform distribution in the gth generation
- A random number drawn from the standard uniform distribution in the gth generation
- A random number drawn from the standard uniform distribution for the dth coordinate in the gth generation
- A predefined parameter between 0 and 1
- A generation-specific value of the above parameter
- Maximum feasible movement length
- Maximum feasible movement length for the dth coordinate in the gth generation
- Number of local search iterations

Parameters (Inter-particle Interactions):
- A vector denoting the maximum feasible movement range
- A parameter between 0 and 1, determining the exponent of the distance term in the total force formula of EML
- Simulated charge of the pth particle in the gth generation
- Total force exerted on the dth dimension of the pth particle in the gth generation
- Vector of total force exerted on the pth particle in the gth generation
- A predefined reduction factor in [0,1], determining the decrement rate of the exponent parameter

Parameters (Stubborn Movement):
- A predefined reduction factor in [0,1] to alter the number of SM iterations
- A random number drawn from a standard uniform distribution
- 1: If / 0: Otherwise
3.1. Initialization phase

In the original EML algorithm, the initial generation (i.e. all the initial particles) is randomly drawn
from a uniform distribution in the feasible search domain (Birbil & Fang 2003). Although the strategy
of uniform particle distribution helps to avoid the local optimum trap, it makes the algorithm inefficient on
the run-time side. Considering this, a more complex procedure is adopted in HELPME. The procedure
is elaborated below.

First, a simplified model’s estimates are deployed to generate a ‘good’ feasible solution. This
particle is termed the ‘key-particle’ and is incorporated in the starting-generation. We estimated a γ-profile
MDCEV to find the key-particle. After that, the feasible domain of the likelihood-maximization problem
is rescaled, by normalizing each coordinate of the domain to its corresponding coordinate of the key-particle.
All other particles are generated in the rescaled feasible domain⁴. Notably, the key-particle
also takes the form of an all-ones vector in this domain. In the next step, a few Stubborn Movements⁵
are performed to improve the normalized, random particles. The last step adds a ‘mediocre’ particle to the
set of generated particles. Each coordinate of this particle is equal to the mean value of that coordinate
over the other particles. These steps are also depicted in Fig. 1.
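A minimal Python sketch of this initialization, under the assumption that rescaling means dividing each coordinate and its bounds by the corresponding key-particle coordinate (the function and variable names are ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

def init_generation(key_particle, lower, upper, n_particles):
    """Sketch of HELPME's initialization: rescale the domain by the key-particle
    (so the key-particle becomes an all-ones vector), draw random particles in
    the rescaled domain, and append a 'mediocre' mean particle."""
    key = np.asarray(key_particle, float)
    lo, hi = lower / key, upper / key                # rescale bounds by the key-particle
    lo, hi = np.minimum(lo, hi), np.maximum(lo, hi)  # keep ordering when key < 0
    pop = lo + rng.random((n_particles, key.size)) * (hi - lo)
    pop[0] = 1.0                                     # the key-particle is all-ones here
    mediocre = pop.mean(axis=0)                      # 'mediocre' particle (coordinate means)
    return np.vstack([pop, mediocre])

gen = init_generation([2.0, -0.5], np.array([0.1, -4.0]), np.array([10.0, 4.0]), 5)
```

The Stubborn Movement improvements of the random particles (section 3.4) would be applied between drawing `pop` and forming the mediocre particle.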
3.2. Random local search

Efforts in a random local search are focused on finding a better position for each particle in its own
vicinity, in a way that the chance of trapping in a local optimum reduces. Let x denote the current position of
a particle, and y denote the position of a vicinity point. The random local search in the original EML is
performed using Eqs. 6 and 7, as discussed in the following. First, the maximum feasible movement
length (L) is calculated as a predefined percentage (denoted by δ) of the largest upper-bound-to-lower-bound
distance among all coordinates. Then, each coordinate of the current point (x) is moved
randomly either toward the upper or the lower bound by a random step size.

⁴ Zhang et al. (2013) evidenced that the accuracy of EML is highly sensitive to the relative scale of a problem’s
dimensions. The normalization step proposed in the present paper is intended to dampen the scale differences,
especially between the β parameters on one side, and the α and γ parameters on the other.

⁵ Stubborn Movement is a simple direct-search routine developed as part of HELPME. See section 3.4.
y_{gd}^p = x_{gd}^p ± λ_{2g} L   (6)

L = δ max_d (u_d − l_d)   (7)
On the other hand, HELPME’s random local search is performed using Eqs. 8 to 10 instead of Eqs. 6
and 7. In Eq. 10, MinIter is the minimum number of generations that is required to evolve. The
customized procedure differs from the original EML in two ways. First, it differentiates L across
dimensions (see Eq. 9). Having a constant L for all d may lead some coordinates of a particle to go
beyond the feasible domain. This increases the chance of assessing infeasible points and, thereby,
diminishes the overall performance of the algorithm. Second, it differentiates L from one generation to
another (see Eq. 10). While searching in a broader neighbourhood is more appealing in early
generations, smaller movements are expected in the final runs. As another advantage of Eqs. 8 to 10
over Eqs. 6 and 7, one may avoid calibration by assigning a large number (e.g. 1) to δ_1
and a small value (e.g. 0.01) to δ_MinIter without expecting a considerable performance deterioration.
The last customization, again, deals with the issue of infeasible particles. Although the aforementioned
changes reduce the chance of producing infeasible particles, feasibility is not yet guaranteed. There are
different strategies to maintain feasibility during a random search, among which we implemented a
common and simple approach, that is, disregarding infeasible particles (El-Gallad & Sallam 2001). A
general scheme of HELPME’s random local search is also depicted in Fig. 2.
y_{gd}^p = x_{gd}^p ± λ_{2g} L_{gd}   (8)

L_{gd} = δ_g (u_d − l_d)   (9)

δ_g = δ_1 exp( ln(δ_MinIter / δ_1) (g − 1) / (MinIter − 1) )   (10)
Figure 2. General scheme of the proposed random local search
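In Python, the customized search of Eqs. 8 to 10 might look as follows. The exponential interpolation used for δ_g is our reading of Eq. 10 (with δ_1 the starting value and δ_MinIter the final one), so treat that schedule as an assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

def delta_schedule(g, delta_1=1.0, delta_min=0.01, min_iter=20):
    # Assumed form of Eq. 10: geometric decay from delta_1 (g = 1) to delta_min (g = MinIter)
    return delta_1 * np.exp(np.log(delta_min / delta_1) * (g - 1) / (min_iter - 1))

def local_search(x, g, lo, hi):
    L = delta_schedule(g) * (hi - lo)                 # Eq. 9: per-dimension length
    y = x + (2.0 * rng.random(x.size) - 1.0) * L      # Eq. 8: random signed move
    return y if np.all((y >= lo) & (y <= hi)) else x  # discard infeasible points

y = local_search(np.array([0.5, 0.5]), g=1, lo=np.zeros(2), hi=np.ones(2))
```

Note the infeasibility handling: an out-of-bounds candidate is simply dropped, as in El-Gallad & Sallam (2001).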
3.3. Inter-particle interactions
Simulated particle charges are determined using Eq. 11, which assumes the charge q_g^p to be a function of
the relative deviation of each particle’s log-likelihood from the best particle’s log-likelihood. In this way,
higher charges are assigned to particles with larger log-likelihoods. The relative deviation is also
multiplied by the number of dimensions of the problem (D), for the sake of computational stability (see
Birbil & Fang 2003 for more details). Eq. 11 assigns positive charges to all particles, so the direction of
the forces should be determined through an if-then rule. The total electromagnetic force exerted on each
particle is calculated using Eq. 12, inspired by the Coulomb formula. The exponent parameter φ is
conventionally set to 1. However, many studies (Liang et al. 2006; Zhang et al. 2013) argued that when
a mediocre particle is far from the best particle, the force between them may be considerably lessened
by the quadratic Euclidean distance term in the denominator of Eq. 12. These studies, consequently,
suggested a simplified total force formula, in which the quadratic distance term is removed by setting
φ to zero.
q_g^p = exp( −D ( H(X_g^best) − H(X_g^p) ) / Σ_{l=1}^{P} ( H(X_g^best) − H(X_g^l) ) )   (11)

F_g^p = Σ_{m ≠ p} { (X_g^m − X_g^p) q_g^p q_g^m / ‖X_g^m − X_g^p‖^{2φ}   if H(X_g^m) > H(X_g^p)
                    (X_g^p − X_g^m) q_g^p q_g^m / ‖X_g^m − X_g^p‖^{2φ}   if H(X_g^m) ≤ H(X_g^p) }   (12)
HELPME applies a combination of the two aforementioned strategies to increase the efficiency of
the heuristic. We found that the simplified total force formula decreases the log-likelihood’s increment
rate in early generations, while it increases the tendency of particles to move to better points in final
generations. Therefore, in the first generations of HELPME, total forces are calculated using the conventional
Coulomb formula (i.e. φ = 1); and if the position of the best particle (X_g^best) is not improved after a certain
number of iterations, the parameter φ is decreased by a predefined reduction factor.
This procedure is depicted in Fig. 1. The proper value of the reduction factor could be obtained in a calibration process.
Having the total forces, each particle p is then moved toward the resultant force direction with a 0-to-1 random step size
(λ_g). The movement is conducted using Eq. 13, where RNG_g stands for a vector of components
(calculated by Eq. 14) denoting the maximum feasible movement range toward the upper or lower bounds.
X_{g+1}^p = X_g^p + λ_g ( F_g^p / ‖F_g^p‖ ) RNG_g   (13)

RNG_{gd} = { u_d − x_{gd}^p   if F_{gd}^p > 0
             x_{gd}^p − l_d   if F_{gd}^p ≤ 0 }   (14)
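The charge-and-force logic of Eqs. 11 and 12 can be sketched in Python as follows, for a maximization setting; the particle positions, log-likelihood values, and the small numerical guards are illustrative additions of ours:

```python
import numpy as np

def charges_and_forces(X, H, phi=1.0):
    """Eq. 11: simulated charges from relative log-likelihoods.
    Eq. 12: total electromagnetic forces (better particles attract, worse repel)."""
    P, D = X.shape
    best = H.max()
    q = np.exp(-D * (best - H) / np.maximum((best - H).sum(), 1e-12))  # Eq. 11
    F = np.zeros_like(X)
    for p in range(P):
        for m in range(P):
            if m == p:
                continue
            diff = X[m] - X[p]
            dist = np.linalg.norm(diff)
            mag = q[p] * q[m] / max(dist ** (2.0 * phi), 1e-12)
            # Eq. 12's if-then rule: move toward better particles, away from worse ones
            F[p] += diff * mag if H[m] > H[p] else -diff * mag
    return q, F

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
H = np.array([-10.0, -5.0, -8.0])   # log-likelihoods; particle 1 is best
q, F = charges_and_forces(X, H)
```

Setting `phi=0.0` reproduces the simplified total force formula discussed above.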
3.4. Stubborn Movement
The Stubborn Movement (SM) is developed in this paper as a direct-search heuristic. SM considers a
random direction in each of its own iterations, and obstinately searches to find the best point in that
direction. This heuristic is primarily proposed to increase the chance of escaping from local optima.
So, where the local optimum trap is not a concern, SM could be skipped to save run time. In this regard,
the number of SM iterations is diminished by a predefined reduction factor after each call, as shown in
Fig. 1.
The procedure of random direction finding in SM is mainly inspired by the FW (Frank &
Wolfe 1956) algorithm, which applies a convex combination of two points’ coordinates: the current
position of the particle (x) and a second position of it (y). The key difference between SM
and FW, though, is attributed to the way that y is determined. That is, in FW, y is found based on the first-order
approximation of the objective function around x, but in SM it is determined through a random
procedure based on mirror reflection of x. To be specific, SM determines y by mirror-reflecting half
of x’s current coordinates through the feasible domain’s mid-point, as shown in Eqs. 15 to 17. Reflected
coordinates are randomly chosen using λ_3.
y_{gd}^p = b_d^p y*_{gd}^p + (1 − b_d^p) x_{gd}^p   (15)

Where:

b_d^p = 1 if λ_{3d} ≤ 0.5, and 0 otherwise   (16)

y*_{gd}^p = u_d + l_d − x_{gd}^p   (17)
Figure 3. General scheme of SM algorithm
SM finds the optimal step size using a line-search technique (see Fig. 3) similar to the
backtracking procedure (Armijo 1966). The adopted approach starts with a large initial step size, which
is iteratively decreased until all the points lying between x and y are tested and the best possible
point is determined. However, the sufficient-decrement criterion (Armijo 1966) of the original
backtracking procedure is not applied in SM, to eliminate the algorithm’s dependency on gradient
information.
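A compact Python sketch of one SM iteration; the greedy scan over intermediate points stands in for the backtracking line search of Fig. 3, and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(3)

def stubborn_movement(x, f, lo, hi, n_steps=16):
    """One SM iteration: mirror-reflect a random half of x's coordinates through
    the domain mid-point to get a second point y, then scan the segment between
    x and y and keep the best point (a derivative-free step-size search)."""
    flip = rng.random(x.size) <= 0.5                  # coordinates chosen for reflection
    y = np.where(flip, hi + lo - x, x)                # mirror through the mid-point (lo+hi)/2
    ts = np.linspace(0.0, 1.0, n_steps + 1)
    candidates = [x + t * (y - x) for t in ts]        # points between x and y
    return max(candidates, key=f)                     # best point along the direction

f = lambda v: -float(np.sum((v - 0.8) ** 2))          # toy objective, maximum at (0.8, 0.8)
x0 = np.array([0.2, 0.2])
x_new = stubborn_movement(x0, f, np.zeros(2), np.ones(2))
```

Because the scan always includes t = 0 (i.e. x itself), an SM iteration can never worsen the particle.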
3.5. Coarse stage termination

Most meta-heuristic optimization algorithms terminate upon stability of the objective value and/or the
unknown variables, or after a predefined number of iterations. Birbil & Fang (2003) suggested a maximum
of 25D generations as a proper stopping criterion in the original EML. However, this criterion is not
appropriate for HELPME. In fact, HELPME could stop the meta-heuristic procedure in the Coarse stage
far earlier than 25 iterations per dimension, because of the improvements in EML and, more
importantly, the succeeding Fine stage. If the Coarse stage terminates too early, however, the optimization
process might not take full advantage of the heuristic. Thus, the following rules are set in HELPME to
terminate the Coarse stage:

At least MinIter generations are evolved (MinIter ∈ [1.5D, 3.5D]).

The best particle does not change in 0.1 MinIter successive generations.

The CPU-time elapsed to improve the objective value by one unit in evolving a generation
exceeds a critical value.

If the first criterion is met along with either the second or the third rule, the process is stopped.
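Expressed as a predicate in Python (with our assumption that the stale-best threshold of the second rule is 0.1·MinIter generations), the termination test reads:

```python
def coarse_should_stop(g, min_iter, stale_count, cpu_time_per_unit, time_cap):
    """Coarse-stage stopping rules: enough generations evolved, AND either a
    stale best particle or excessive CPU time per unit of objective improvement."""
    enough_generations = g >= min_iter               # first criterion
    stale_best = stale_count >= 0.1 * min_iter       # second criterion (assumed threshold)
    too_slow = cpu_time_per_unit > time_cap          # third criterion
    return enough_generations and (stale_best or too_slow)
```

Here `stale_count` is the number of successive generations without a new best particle, and `cpu_time_per_unit` is the elapsed CPU time per unit of objective improvement in the last generation.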
3.6. Fine stage
This stage is intended to improve the best particle found in the last generation by a gradient-based
routine. Here we employ a recursive-BFGS, which is depicted in Fig. 1. To be specific, the Satiation and
Translation parameters are sequentially fixed at their current values, while the other parameters are
improved by LSIter1 iterations of BFGS. The sequence continues until either the relative change in the
log-likelihood value or the maximum relative change in the parameters reaches pre-specified critical
values. Having met the stopping criteria, either the Translation or the Satiation parameters are fixed again and
the rest of the parameters are estimated precisely using the standard BFGS. Fixing the Translation or
Satiation parameters in this step results in a Satiation-based or a Translation-based mixed-profile,
respectively.
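The alternating structure of the recursive-BFGS can be sketched as follows. The round structure is our reading of the text, and the smooth toy objective stands in for the negative log-likelihood; this is not the paper's MATA implementation:

```python
import numpy as np
from scipy.optimize import minimize

def recursive_bfgs(neg_ll, beta0, alpha0, gamma0, ls_iter=5, tol=1e-6, max_rounds=50):
    """Fine-stage sketch: alternately hold the translation (gamma) and satiation
    (alpha) blocks fixed while a few BFGS iterations improve the rest; once the
    objective stabilizes, fix gamma for good (a Satiation-based mixed profile)."""
    beta = np.asarray(beta0, float)
    alpha = np.asarray(alpha0, float)
    gamma = np.asarray(gamma0, float)
    nb = beta.size
    prev = neg_ll(beta, alpha, gamma)
    for _ in range(max_rounds):
        # gamma fixed: improve (beta, alpha) by ls_iter BFGS iterations
        res = minimize(lambda t: neg_ll(t[:nb], t[nb:], gamma),
                       np.concatenate([beta, alpha]), method="BFGS",
                       options={"maxiter": ls_iter})
        beta, alpha = res.x[:nb], res.x[nb:]
        # alpha fixed: improve (beta, gamma)
        res = minimize(lambda t: neg_ll(t[:nb], alpha, t[nb:]),
                       np.concatenate([beta, gamma]), method="BFGS",
                       options={"maxiter": ls_iter})
        beta, gamma = res.x[:nb], res.x[nb:]
        cur = neg_ll(beta, alpha, gamma)
        if abs(prev - cur) < tol:
            break
        prev = cur
    # final precise estimation with gamma fixed
    res = minimize(lambda t: neg_ll(t[:nb], t[nb:], gamma),
                   np.concatenate([beta, alpha]), method="BFGS")
    return res.x[:nb], res.x[nb:], gamma

# toy smooth objective standing in for the negative log-likelihood
f = lambda b, a, g: (np.sum(b ** 2) + np.sum((a - 0.3) ** 2)
                     + np.sum((g - 2.0) ** 2) + np.sum(b * a))
beta, alpha, gamma = recursive_bfgs(f, [1.0], [0.5], [1.0])
```

Swapping which block is fixed in the final polish would yield the Translation-based variant instead.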
4. Empirical analysis
This section is devoted to comprehensive empirical analyses of: (1) the performance of
HELPME’s Coarse and Fine stages in estimating mixed-profiles for a basic MDCEV formulation, and
(2) comparing mixed- and conventional-profiles from various perspectives, including different in-sample
accuracy metrics, mean percentage of out-of-sample prediction errors, biasedness of out-of-sample
predictions, and overall satiation curves. All two-sample comparisons are conducted using
Welch’s t-test (Welch 1947). A quick introduction of the data is presented below, followed by a
detailed discussion on the empirical analyses.
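Each of those two-sample comparisons reduces to a single SciPy call; the two error samples below are synthetic stand-ins, not the paper's results:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# two hypothetical samples of out-of-sample percentage errors
errors_a = rng.normal(loc=0.30, scale=0.10, size=200)
errors_b = rng.normal(loc=0.25, scale=0.20, size=200)

# Welch's t-test: a two-sample t-test without the equal-variance assumption
t_stat, p_value = stats.ttest_ind(errors_a, errors_b, equal_var=False)
```

`equal_var=False` is what distinguishes Welch's test from the standard pooled-variance t-test, which matters here because different profiles need not produce errors with equal variance.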
Due to concerns about the practical credibility of the results⁶, experiments in this section are
conducted using empirical data, the Atlanta Regional Travel Survey (ARC 2011). The dataset is fairly
recent, and is rich in terms of the diversity of explanatory variables and the number of observations. Also, it is
freely available, making it easy for interested readers to verify the results. A total of 25,810
individuals participated in the survey, 22,249 of whom reported at least one outdoor activity on the
survey day. We randomly selected fifty percent of the 22,249 observations for model estimation and
reserved the rest for out-of-sample prediction experiments.
Figure 4. Outdoor activity types and durations in the 50% estimation-sample
Four MDCEV models for outdoor activity types and durations are estimated. The activities are
aggregated into nine categories, namely Work, Education, Shop, Social, Recreation, Healthcare, Eat
Meal, Maintenance, and Other. Fig. 4 illustrates time expenditure patterns of the 50% estimation-
sample. All the models adopt the basic structure of MDCEV which assumes IID extreme value error
⁶ Synthetic data fail to truthfully represent real-life situations and thereby undermine practical findings such as
the magnitude of change in likelihood value, run time, and prediction power.
terms. The models are categorized, based on their satiation profiles, into two main classes. The first class
contains models with conventional α- and γ-profiles, which are routinely estimated using BFGS. The
second class includes Satiation-based and Translation-based mixed-profile models. Models of this
class are estimated five times using HELPME, with different random numbers provided to its Coarse
stage. The algorithm proposed by Pinjari & Bhat (2010)⁷ is used to predict consumption quantities
(c_k). To obtain comparable results, random inputs to that algorithm are set equal among all the
experiments for each run. All the models have identical explanatory variables, which are described in
Table 2. The estimation results are outlined in Table 3. The reader will note that, in Table 3, no t-statistic
has been reported for the γ parameters in the Satiation-based mixed profile, or for the α parameters in
the Translation-based mixed profile. The standard errors are not reported to restate the fact that a
mixed profile is not intended to reduce the dependency level of α and γ. These parameters are
interrelated in nature, regardless of the value at which one is fixed. Indeed, the γ parameters in a Satiation-based
mixed profile (the α parameters in a Translation-based mixed profile) should still be treated as fixed
values, similar to the case of conventional profiles.
⁷ The algorithm was originally proposed to predict within a γ-profile framework. Pinjari & Bhat (2010) also
suggest modifications using the bisection technique to make the algorithm compatible with cases where α and
γ are both subject to change. We use the modified version for prediction of a mixed-profile. Instead of the
bisection algorithm, though, Newton’s algorithm is used to eliminate the need for defining proper upper and
lower bounds for λ. Also, the λ from previous iterations is used as an initial value inputted to each iteration.
Table 2. Description of independent variables

Name | Definition | Average | Std. Dev.
HighEdu | 1: If the individual has a college degree/ 0: Otherwise | 0.397 | 0.489
Worker | 1: If the individual works/ 0: Otherwise | 0.554 | 0.497
Age | Age of the individual | 38.412 | 20.870
Student | 1: If the individual is a student/ 0: Otherwise | 0.047 | 0.212
Elderly | 1: If the individual is older than 60/ 0: Otherwise | 0.155 | 0.362
Male | 1: If the individual is male/ 0: Otherwise | 0.472 | 0.499
TeleWork | 1: If the individual works at home/ 0: Otherwise | 0.050 | 0.219
Homemaker | 1: If the individual is a homemaker/ 0: Otherwise | 0.038 | 0.191
White | 1: If the individual is White/ 0: Otherwise | 0.733 | 0.442
African | 1: If the individual is African-American/ 0: Otherwise | 0.191 | 0.393
Asian | 1: If the individual is Asian/ 0: Otherwise | 0.020 | 0.140
Nchild | Number of household children | 1.099 | 1.243
HighIncome | 1: If yearly income of the individual’s household is more than 60,000 dollars/ 0: Otherwise | 0.669 | 0.470
Table 3. Estimation results for mixed-profile MDCEV models

| Activity | Variable | α-profile Coeff. (t-value) | Satiation-based mixed-profile Coeff. (t-value) | γ-profile Coeff. (t-value) | Translation-based mixed-profile Coeff. (t-value) |
|---|---|---|---|---|---|
| Work | Constant | | | | |
| | HighEdu | -0.21 (-4.92) | -0.19*** (-4.77**) | -0.20 (-4.88) | -0.19** (-4.77**) |
| | Worker | 6.29 (17.70) | 6.30*** (17.74***) | 6.26 (17.62) | 6.30** (17.74**) |
| Education | Constant | 7.05 (19.76) | 7.09*** (19.86***) | 7.09 (19.88) | 7.09** (19.86**) |
| | Age | -0.09 (-31.91) | -0.08*** (-31.99***) | -0.09 (-32.15) | -0.08** (-32.00**) |
| | HighEdu | -1.69 (-10.27) | -1.69*** (-10.31***) | -1.70 (-10.31) | -1.69** (-10.31**) |
| | Student | 0.96 (12.02) | 0.96*** (12.02***) | 0.96 (12.02) | 0.96** (12.02**) |
| Shop | Constant | 5.67 (15.89) | 5.62*** (15.76***) | 5.61 (15.71) | 5.62*** (15.76***) |
| | Elderly | 0.62 (10.69) | 0.62** (10.69**) | 0.60 (10.45) | 0.62*** (10.69***) |
| | Male | -0.34 (-7.67) | -0.34*** (-7.90***) | -0.34 (-7.74) | -0.34*** (-7.90***) |
| | TeleWork | 2.27 (16.10) | 2.24** (15.94**) | 2.21 (15.71) | 2.24** (15.94**) |
| | Homemaker | 0.63 (7.22) | 0.65** (7.45**) | 0.61 (7.03) | 0.65*** (7.44***) |
| | HighIncome | -0.21 (-4.65) | -0.22*** (-4.91***) | -0.21 (-4.70) | -0.22*** (-4.91***) |
| | Nchild | -0.17 (-8.42) | -0.16*** (-7.92***) | -0.16 (-7.89) | -0.16*** (-7.92***) |
| Social | Constant | 4.79 (13.46) | 4.76*** (13.39***) | 4.76 (13.40) | 4.76*** (13.39***) |
| | Asian | -0.50 (-1.98) | -0.49*** (-1.97***) | -0.49 (-1.96) | -0.49*** (-1.97***) |
| | HighEdu | -0.36 (-5.72) | -0.35*** (-5.67***) | -0.35 (-5.68) | -0.35*** (-5.67***) |
| | Worker | -0.20 (-3.23) | -0.19*** (-3.06***) | -0.20 (-3.13) | -0.19** (-3.06**) |
| | TeleWork | 2.17 (13.28) | 2.17** (13.26**) | 2.13 (13.05) | 2.17** (13.26**) |
| Recreation | Constant | 4.66 (12.80) | 4.64*** (12.76***) | 4.63 (12.74) | 4.64*** (12.76***) |
| | African | -0.36 (-4.18) | -0.36** (-4.16**) | -0.36 (-4.14) | -0.36*** (-4.15***) |
| | Age | -0.01 (-5.25) | -0.01** (-5.18**) | -0.01 (-5.15) | -0.01*** (-5.18***) |
| | Worker | -0.35 (-5.08) | -0.34** (-4.91**) | -0.34 (-4.91) | -0.34*** (-4.91***) |
| | Student | -0.42 (-2.71) | -0.41** (-2.64**) | -0.41 (-2.64) | -0.41*** (-2.64***) |
| | TeleWork | 2.37 (14.61) | 2.38** (14.64**) | 2.34 (14.41) | 2.38** (14.64**) |
| | HighIncome | 0.42 (5.91) | 0.41** (5.79**) | 0.41 (5.77) | 0.41*** (5.79***) |
| Healthcare | Constant | 4.19 (11.56) | 4.18*** (11.54***) | 4.18 (11.55) | 4.18*** (11.54***) |
| | White | -0.26 (-3.32) | -0.25*** (-3.26***) | -0.26 (-3.28) | -0.25*** (-3.26***) |
| | Elderly | 1.02 (12.42) | 1.01*** (12.30***) | 1.01 (12.34) | 1.01*** (12.30***) |
| | Male | -0.43 (-5.67) | -0.42*** (-5.60***) | -0.42 (-5.64) | -0.42*** (-5.60***) |
| | Worker | -0.40 (-5.13) | -0.39*** (-5.03***) | -0.40 (-5.10) | -0.39*** (-5.03***) |
| | TeleWork | 2.40 (12.74) | 2.41** (12.73**) | 2.37 (12.56) | 2.41** (12.73**) |
| Eat Meal | Constant | 4.25 (11.76) | 4.22*** (11.68***) | 4.22 (11.67) | 4.22*** (11.68***) |
| | White | 0.61 (8.70) | 0.60** (8.62**) | 0.60 (8.54) | 0.60*** (8.62***) |
| | Elderly | 0.18 (2.52) | 0.19** (2.62**) | 0.19 (2.55) | 0.19** (2.62**) |
| | TeleWork | 2.25 (15.04) | 2.26** (15.06**) | 2.21 (14.78) | 2.26** (15.06**) |
| | HighIncome | 0.28 (4.69) | 0.28*** (4.77***) | 0.28 (4.63) | 0.28*** (4.77***) |
| | Nchild | -0.23 (-9.19) | -0.22*** (-8.99***) | -0.23 (-8.90) | -0.22*** (-8.99***) |
| Maintenance | Constant | 4.60 (12.85) | 4.57*** (12.78***) | 4.57 (12.77) | 4.57*** (12.78***) |
| | Age | 0.02 (12.65) | 0.01*** (11.99***) | 0.01 (12.04) | 0.01*** (11.99***) |
| | Worker | 0.31 (6.57) | 0.33*** (6.96***) | 0.34 (7.16) | 0.33** (6.95**) |
| | TeleWork | 2.27 (16.52) | 2.28** (16.56**) | 2.28 (16.52) | 2.28** (16.56**) |
| Other | Constant | 5.48 (15.45) | 5.44*** (15.32***) | 5.44 (15.33) | 5.44*** (15.32***) |
| | African | 0.16 (3.08) | 0.15*** (2.97***) | 0.15 (3.03) | 0.15** (2.97**) |
| | Male | -0.21 (-5.12) | -0.20*** (-4.94***) | -0.21 (-5.00) | -0.20*** (-4.94***) |
| | TeleWork | 1.88 (13.03) | 1.90** (13.13**) | 1.86 (12.87) | 1.90** (13.13**) |

Note: Symbols ***, **, and *, respectively, mean that the corresponding CV is less than 1E-4, 1E-3, and 1E-1.
Table 3. (Continued) Estimation results for mixed-profile MDCEV models

| Parameter | Activity | α-profile | Satiation-based mixed-profile | γ-profile | Translation-based mixed-profile |
|---|---|---|---|---|---|
| Satiation parameters | Work | 0.98 (5.73) | -8.90 (-50.50) | 0.00 (–) | -11.85 (–) |
| | Education | 0.98 (6.26) | -9.32 (-63.08) | 0.00 (–) | -10.13 (–) |
| | Shop | 0.73 (25.30) | -2.97* (-141.43*) | 0.00 (–) | -2.99* (–) |
| | Social | 0.87 (20.83) | -0.04* (-1.11*) | 0.00 (–) | -0.04* (–) |
| | Recreation | 0.83 (20.25) | -4.75* (-138.85*) | 0.00 (–) | -4.83* (–) |
| | Healthcare | 0.87 (13.85) | -0.33* (-5.89*) | 0.00 (–) | -0.33* (–) |
| | Eat Meal | 0.75 (21.41) | -8.79* (-335.01*) | 0.00 (–) | -8.52* (–) |
| | Maintenance | 0.74 (26.99) | 0.57** (26.26***) | 0.00 (–) | 0.57** (–) |
| | Other | 0.66 (23.34) | 0.38*** (19.44***) | 0.00 (–) | 0.38** (–) |
| Translation parameters | Work | 1.00 (–) | 35687.56 (–) | 6529.32 (21161.24) | 46268.96 (268217.86) |
| | Education | 1.00 (–) | 23423.83 (–) | 2063.36 (14966.08) | 25516.58 (172887.15) |
| | Shop | 1.00 (–) | 179.01* (–) | 33.46 (1014.80) | 180.61* (7246.90*) |
| | Social | 1.00 (–) | 232.04* (–) | 211.07 (3955.86) | 232.03* (4287.10*) |
| | Recreation | 1.00 (–) | 805.23* (–) | 115.14 (2394.70) | 820.98* (21753.72*) |
| | Healthcare | 1.00 (–) | 186.14* (–) | 127.24 (1777.18) | 186.08* (2662.12*) |
| | Eat Meal | 1.00 (–) | 596.06* (–) | 45.68 (1131.14) | 582.78* (20554.35*) |
| | Maintenance | 1.00 (–) | 6.99** (–) | 30.75 (821.18) | 6.99** (151.81**) |
| | Other | 1.00 (–) | 6.79*** (–) | 14.74 (432.47) | 6.79*** (175.14***) |
| Scale parameter | | 1.00 (–) | 1.00 (–) | 1.00 (–) | 1.00 (–) |
| Log-likelihood value | | -93,613.35 | -87,882.19*** | -88,766.93 | -87,882.12*** |
| Bayesian Information Criterion (BIC) | | -187,729.79 | -176,351.32*** | -178,036.96 | -176,351.11*** |
| Akaike Information Criterion (AIC) | | -187,334.69 | -175,890.37*** | -177,641.86 | -175,890.23*** |
| CPU time (minutes) | | 30.51 | 70.29* | 45.71 | 69.76* |

Coarse stage parameters: P=5, LSIter=5, =0.9, =0.9, =1, =0.01
Fine stage termination parameters: log-likelihood tolerance: 1E-6; maximum parameter tolerance: 1E-6
Note: Symbols ***, **, and *, respectively, mean that the corresponding CV is less than 1E-4, 1E-3, and 1E-1. Entries shown as (–) are fixed rather than estimated.
4.1. Performance of HELPME in MDCEV estimation
As mentioned above, providing rough solutions and being considerably sensitive to random inputs are among the most important disadvantages of meta-heuristics compared to gradient-based approaches. This section discusses the performance of HELPME in estimating mixed-profiles for a basic MDCEV. The performance is explored after the completion of each stage.
After termination of the Coarse stage, we explored three aspects, namely optimality of the final log-likelihood value, sensitivity of the final log-likelihood value to random inputs, and the prospect of improving the best particle. The first two aspects are quantified by, respectively, the average and coefficient of variation (CV) of the final log-likelihoods across different runs, and the third is measured by calculating the portion of generations with an improved best particle. Each of these metrics is calculated for the original EML and three of its variants to capture the marginal effects of the proposed customizations. The results are summarized in Table 4. In this table, for each variant, one row results from the recommended stopping criteria, while the other is obtained by letting the algorithm run for a longer or shorter time than usual. Per the results, applying the new random local search causes a 59% increase in the final log-likelihood (i.e., from -617,885 to -251,346) at an 87% confidence level, and an over-51% decrease in the final log-likelihood's CV (i.e., from 54% to 26%). Applying the new initialization phase further results in a 64% growth of the final log-likelihood (i.e., from -251,346 to -89,169) at a 94% confidence level, and a 99% decline in the CV of the final log-likelihood (i.e., from 26% to 0.2%). The proposed force calculation method also causes a 0.17% improvement of the final log-likelihood (i.e., from -89,169 to -89,021), which is significant at an 85% level of confidence. Lastly, the best particle is about 22% and 9% more likely to be improved when the new initialization phase and force calculation method are used, respectively.
In addition, Fig. 5 depicts log-likelihood versus the heuristic's generation number in each of the five runs. As can be seen, the 5th run provides the best EML estimates in terms of log-likelihood value. In this run, the log-likelihood of EML's best initial particle is about -900,000, which is almost ten times the starting point of HELPME (i.e., the Key-particle's log-likelihood). This figure also depicts the considerable improvement in stability of the final log-likelihood when HELPME's Coarse stage replaces the original EML. Final log-likelihoods of the original EML's particles range roughly between -300,000 and -1,100,000, while those of HELPME's Coarse stage range between -89,000 and -89,500.
Table 4. Marginal effects of each modification on EML

| Initialization and SM | Random Local Search | Force Calculation | Number of Generations | Average CPU Time (minutes) | Log-likelihood: Average | Log-likelihood: CV (%) | Improved Best Particle: Portion of Generations | Improved Best Particle: CV (%) |
|---|---|---|---|---|---|---|---|---|
| – | – | – | 120 | 2.57 | -1,393,434 | 25 | 1.17% | 167 |
| – | – | – | 1600 | 34.62 | -617,885 | 54 | 10.26% | 27 |
| – | ✓ | – | 120 | 2.83 | -591,849 | 41 | 4.67% | 79 |
| – | ✓ | – | 650 | 17.69 | -251,346 | 26 | 8.87% | 58 |
| ✓ | ✓ | – | 105 | 5.12 | -89,169 | 0.2 | 32.67% | 14 |
| ✓ | ✓ | – | 664 | 19.87 | -88,697 | 0.04 | 17.10% | 16 |
| ✓ | ✓ | ✓ | 106 | 5.23 | -89,021 | 0.2 | 41.82% | 21 |
| ✓ | ✓ | ✓ | 648 | 19.95 | -88,708 | 0.08 | 17.31% | 19 |
Figure 5. Maximization trend of the unconstrained model's log-likelihood: (a) original EML; (b) Coarse stage of HELPME
After termination of the Fine stage, we explored two aspects: stability of the model's estimates and stability of the final likelihood value. Based on the model estimation results in Table 3, the CVs of all of HELPME's outputs are considerably low. For mixed-profile models, the CVs of the log-likelihood at convergence are less than 1E-4. Besides, the CVs of all parameters of the baseline marginal utility are less than 1E-3. Similar results are achieved for the other outputs (see t-values and CPU times in Table 3), revealing the negligible sensitivity of HELPME's accuracy to random numbers.
4.2. Distinctions between the mixed-profile, α-profile, and γ-profile
Estimating a mixed-profile model costs extra computational time (see Table 3). However, the mixed-profile outperforms the conventional profiles on the accuracy side, as evidenced in this section. A detailed discussion is presented on the results of various model selection metrics, namely: (1) BIC, AIC, and the likelihood ratio test as in-sample metrics, (2) mean absolute percentage of prediction errors and biasedness of predictions as out-of-sample measures, and (3) overall satiation patterns.
BIC and AIC are two widely used model selection metrics based solely on in-sample information. Under the sign convention of Table 3, models with higher BIC and/or AIC values have better in-sample fits. Table 3 outlines the values of these metrics, setting the number of parameters of the MDCEV models with the conventional and mixed-profiles to, respectively, 54 and 63. Per both the BIC and AIC results in Table 3, the γ-profile model has a 5%-better fit compared to the α-profile, and the Translation-based mixed-profile is about 1% better-fitted compared to the γ-profile. Moreover, a likelihood ratio test is performed to capture the statistical significance of the gap between the in-sample fits of the γ-profile MDCEV and the Translation-based mixed-profile MDCEV. The likelihood ratio statistic is calculated as 1769.62, which is statistically significant at the 99.99% level. As none of these in-sample tests detect a significant difference between the two mixed-profiles, we refer to the 'Translation-based mixed-profile' simply as the 'mixed-profile' in the remainder of the paper.
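These in-sample comparisons can be checked directly from the figures reported in Table 3. The sketch below assumes the table's sign convention (criteria computed from 2·LL, so higher is better) and reproduces the reported likelihood ratio statistic and AIC values; the sample size used for BIC is not restated in this section, so BIC is shown only in formula form.

```python
# Reproducing the in-sample comparison from Table 3's log-likelihoods.
import math

def aic(ll, k):
    return 2.0 * ll - 2.0 * k          # higher is better under this convention

def bic(ll, k, n):
    return 2.0 * ll - k * math.log(n)  # n = estimation-sample size

ll_gamma, k_gamma = -88766.93, 54      # gamma-profile (Table 3)
ll_mixed, k_mixed = -87882.12, 63      # Translation-based mixed-profile (Table 3)

# Likelihood ratio statistic for mixed- vs gamma-profile (9 extra parameters);
# far above the chi-square critical value of ~21.67 (df = 9, 99% level).
lr = 2.0 * (ll_mixed - ll_gamma)       # = 1769.62, as reported in the text

aic_mixed = aic(ll_mixed, k_mixed)     # ~ -175,890.24 (Table 3: -175,890.23)
aic_gamma = aic(ll_gamma, k_gamma)     # ~ -177,641.86, matching Table 3
```

The tiny 0.01 discrepancy in the mixed-profile AIC comes from rounding the log-likelihood to two decimals before recomputing.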
The out-of-sample prediction power of the γ-profile and mixed-profile models is investigated using the reserved 50% prediction-sample. Thirty sub-samples are randomly drawn with replacement from the prediction-sample, each containing 20 percent of the prediction-sample. Then, observed activity durations are recorded for each sub-sample. After that, the two models are applied to each sub-sample, producing a total of 30 sets of activity duration predictions for each model. All predictions are performed using 1,000 sets of simulated error terms (ε) inputted to the prediction algorithm (see Pinjari & Bhat 2010). Moreover, stochastic budget determination is avoided⁸ by using observed total activity durations, for the sake of simplicity. The prediction records are compared using two different out-of-sample metrics, as discussed below.
The first out-of-sample metric is MAPE (Mean Absolute Percentage Error), which is argued to be a reliable aggregate representation of the individual-level gaps between predictions and observations (see Appendix 1). The calculated MAPE values are outlined in Table 5. As shown, by employing a mixed-profile, all activities experience considerable and statistically significant reductions in out-of-sample prediction errors. The lowest and highest levels of statistical confidence are, respectively, 79% and 93%. Regarding the percent change in MAPE, the results indicate that the least-improved predictions are associated with the activity type Other (68%), and the most improvement is observed for Education (97%). Regarding the absolute change, though, the lowest and highest error reductions are observed, respectively, for Healthcare (130 percentage points) and Work (591 percentage points).
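The evaluation loop just described (30 sub-samples drawn with replacement, MAPE computed on each) can be sketched as below. The observed and predicted durations here are toy numbers standing in for the model's output, not the paper's data.

```python
# Sketch of the out-of-sample MAPE evaluation over bootstrap sub-samples.
import random

def mape(observed, predicted):
    """Mean absolute percentage error over paired observations (zeros skipped)."""
    n = sum(1 for o in observed if o != 0)
    return 100.0 * sum(abs(o - p) / abs(o)
                       for o, p in zip(observed, predicted) if o != 0) / n

def subsample_mapes(observed, predicted, n_subs=30, frac=0.2, seed=0):
    """Draw n_subs sub-samples with replacement, each frac of the sample."""
    rng = random.Random(seed)
    size = max(1, int(frac * len(observed)))
    out = []
    for _ in range(n_subs):
        idx = [rng.randrange(len(observed)) for _ in range(size)]  # with replacement
        out.append(mape([observed[i] for i in idx],
                        [predicted[i] for i in idx]))
    return out

obs = [60.0, 120.0, 30.0, 240.0, 90.0]    # toy observed durations (minutes)
pred = [66.0, 108.0, 33.0, 216.0, 99.0]   # toy predictions, each off by 10%
mapes = subsample_mapes(obs, pred)
```

Because every toy prediction is off by exactly 10%, every sub-sample MAPE equals 10% here; with real predictions, the spread of `mapes` gives the standard errors reported in Table 5.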
8 Previous studies have employed different approaches. Bhat et al. (2013), for instance, adopted the Fractional Split model proposed in Sivakumar & Bhat (2002).
Table 5. Comparing MAPE of out-of-sample predictions†

| Activity | γ-profile MAPE: Average (%) | γ-profile MAPE: Std. Err. (%) | Mixed-profile MAPE: Average (%) | Mixed-profile MAPE: Std. Err. (%) | p-Value (%) | Absolute Reduction (%) | Percent Reduction (%) |
|---|---|---|---|---|---|---|---|
| Educating | 373.39 | 1358.52 | 10.40 | 1.83 | 92.4 | 362.99*** | 97.21 |
| Working | 612.80 | 2213.13 | 21.12 | 6.24 | 92.4 | 591.67*** | 96.55 |
| Social | 395.45 | 1423.67 | 15.10 | 1.94 | 92.4 | 380.35*** | 96.18 |
| Health | 136.95 | 486.90 | 6.76 | 0.55 | 92.4 | 130.19*** | 95.06 |
| Recreation | 227.81 | 810.46 | 11.50 | 1.09 | 92.3 | 216.30*** | 94.95 |
| Eat Meal | 168.76 | 576.91 | 13.54 | 0.82 | 92.5 | 155.23*** | 91.98 |
| Shopping | 243.22 | 784.32 | 25.66 | 1.68 | 93.1 | 217.56*** | 89.45 |
| Maintenance | 458.40 | 1442.39 | 103.64 | 7.91 | 90.6 | 354.76*** | 77.39 |
| Other | 327.52 | 948.50 | 103.61 | 7.66 | 79.4 | 223.91* | 68.37 |

Note: † Over 30 sub-samples randomly drawn with replacement from the reserved sample. Symbols *** and *, respectively, mean significant at the 10% and 25% levels.
The second out-of-sample metric is the bar chart shown in Fig. 6, depicting the mean values of observed and predicted activity durations. Although Fig. 6 is similar to Fig. 4(b), it differs from Fig. 4(b) in two ways. First, unlike Fig. 4(b), zero-consumption observations are not excluded from Fig. 6. Second, Fig. 4(b) is generated using the 50% estimation-sample, while Fig. 6 is generated by averaging activity durations over the 30 sub-samples drawn from the 50% prediction-sample. As shown, the mixed-profile imposes notably less bias on the durations of Shop (60% reduction), Recreation (9% reduction), Eat Meal (67% reduction), Maintenance (12% reduction), and Other (26% reduction) compared to the γ-profile. The mixed-profile is roughly as biased as the γ-profile for the rest of the activities (i.e., Work, Education, Social, and Healthcare). While the γ-profile overestimates the durations of Shop, Recreation, and Eat Meal, it underestimates Maintenance and Other durations. Regarding bias magnitudes, the three largest biases are associated with the γ-profile's predictions for Eat Meal, Shop, and Healthcare, and the lowest bias is associated with the mixed-profile's predictions for Education.
Figure 6. Average activity durations suggested by observations and each satiation profile: (a) mandatory activities; (b) non-mandatory activities
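The bias comparison behind Fig. 6 can be sketched as a simple computation: aggregate bias is the gap between mean predicted and mean observed duration, and the reduction is the relative shrinkage of that gap when moving from the γ-profile to the mixed-profile. The numbers below are illustrative, not the paper's.

```python
# Hedged sketch of the aggregate-bias comparison in Fig. 6.

def bias(observed_mean, predicted_mean):
    """Aggregate bias: > 0 means overestimation, < 0 means underestimation."""
    return predicted_mean - observed_mean

def bias_reduction(obs_mean, pred_gamma, pred_mixed):
    """Percent shrinkage in |bias| when the mixed-profile replaces the gamma-profile."""
    b_g = bias(obs_mean, pred_gamma)
    b_m = bias(obs_mean, pred_mixed)
    return 100.0 * (abs(b_g) - abs(b_m)) / abs(b_g)

# Toy example: gamma-profile overestimates by 20 minutes, mixed by 8 minutes.
reduction = bias_reduction(60.0, 80.0, 68.0)
```
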
Finally, overall satiation patterns are compared in Fig. 7, which consists of two components. The first component depicts the different estimated satiation profiles. That is, the baseline marginal utility is set to exp(1) for all goods, and the utility functions estimated by either the conventional or mixed-profiles are plotted against the activity durations. The second component shows the observed time-allocation frequencies (pooling the results of all 30 sub-samplings). Such frequency distributions provide loose, but useful, information on the upper bound of the point after which full satiation occurs. Per the observed distribution of shop durations, an average individual has at most a 1% likelihood of wishing to extend his or her shopping duration beyond about 4 hours and 40 minutes per day. In other words, the utility profile of Shop is expected to be fully satiated, at most, after 5 hours. The mixed-profile follows this trend well, estimating marginal utilities of 0.091 at 4 hours, 0.035 at 6 hours, and 0.000 at 24 hours. However, neither of the conventional profiles could follow such a pattern. For instance, the γ-profile (which is better fitted than the α-profile) shows marginal utilities of 0.331 at 4 hours, 0.230 at 6 hours, and 0.061 at 24 hours. Recreation and Eat Meal show similar patterns to Shop. For these activities, both the observed distributions and the mixed-profile plots indicate highly satiated consumption, whereas the α- and γ-profiles estimate low- or medium-satiated consumption. In the case of Work, the conventional profiles suggest almost linear satiation for any activity duration. In the case of Education, Social, and Healthcare, however, the conventional profiles suggest similar satiation patterns to the mixed-profile. The upper bound of the full-satiation point for Maintenance and Other is not as clear as for the other activities in Fig. 7. Also, these choices encompass a more diverse range of activities, making it difficult to form a general expectation of actual patterns. Thus, it is not clear from Fig. 7 which profile suggests a more realistic expectation.
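The marginal utilities quoted above follow from Bhat's (2008) sub-utility form: with the baseline marginal utility ψ fixed at exp(1), the marginal utility at duration t is ψ·(1 + t/γ)^(α−1). The sketch below evaluates this for Shop using the parameters in Table 3 (durations assumed to be in minutes); small differences from the quoted figures reflect the two-decimal rounding of the reported parameters.

```python
# Marginal utility of additional duration under the MDCEV sub-utility form.
import math

def marginal_utility(t, alpha, gamma, psi=math.e):
    """psi * (1 + t/gamma)**(alpha - 1); psi = exp(1) as set for Fig. 7."""
    return psi * (1.0 + t / gamma) ** (alpha - 1.0)

# Shop, Translation-based mixed-profile: alpha = -2.99, gamma = 180.61 (Table 3)
mu4  = marginal_utility(4 * 60, -2.99, 180.61)    # ~0.093 (text quotes 0.091)
mu24 = marginal_utility(24 * 60, -2.99, 180.61)   # ~0.000

# Shop, gamma-profile: alpha fixed at 0, gamma = 33.46 (Table 3)
mu4_gamma = marginal_utility(4 * 60, 0.0, 33.46)  # ~0.333 (text quotes 0.331)
```

Evaluating both profiles side by side reproduces the qualitative contrast in Fig. 7: the mixed-profile's marginal utility is essentially exhausted beyond a few hours of shopping, while the γ-profile's decays far more slowly.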
Figure 7. Comparison of conventional- and mixed-profiles for each activity type: (a) Work; (b) Education; (c) Shop; (d) Social; (e) Recreation; (f) Healthcare; (g) Eat Meal; (h) Maintenance; (i) Other
4.3. Discussion of baseline marginal utilities
As evidenced, the mixed- and conventional-profiles estimated in this study suggest distinct marginal utilities and different prediction accuracies, but similar β parameters. This is not surprising since, by definition, various combinations of marginal utilities may result in similar marginal rates of substitution. Also, many explanatory variables are introduced in the baseline marginal utility of each good, and small variations in each of the β parameters may collectively result in a considerable change in the baseline marginal utility. Such small variations in β are also observed in the models developed in Bhat (2008, p. 52). A quick discussion of the estimated β coefficients follows.
Age, gender, ethnicity, and household income are among the most influential explanatory variables introduced in the baseline marginal utilities. The results indicate that, as people age, they tend to be less willing to make high time-allocations to outdoor recreation, which is consistent with previous studies (Bhat 2005). Age is also found to have a positive correlation with time expenditures on maintenance activities. In addition, senior citizens (i.e., aged 60+) are found, compared to other individuals, to be more inclined toward participating in outdoor shopping activities. Srinivasan et al. (2006) and Mattson (2012) have come to similar results. Senior citizens are also found more likely to perform healthcare trips, which is understandable given their age conditions. In view of gender differences, the results suggest a negative correlation between being male and willingness to allocate time to shopping. In line with Bhat (2005), moreover, the results of the current study show that African-American citizens are less likely to spend time on outdoor recreation, while Asians are less inclined toward social activities. The results also indicate that individuals from high-income households tend to participate more in outdoor recreation and Eat Meal activities.
5. Summary and conclusions
Theoretically speaking, empirical identification problems are not necessarily unresolvable. The current paper is an effort to provide methodological avenues and empirical evidence to answer two fundamental questions on the empirical identification problem in a basic MDCEV formulation: (1) how one can estimate both α and γ parameters simultaneously in real-life applications, and (2) what the empirical gains of estimating both α and γ (i.e., a 'mixed-profile') could be. A hybrid optimization routine (HELPME) is proposed to estimate mixed-profiles and answer these questions by comparing mixed and conventional profiles from different aspects.
Essentially speaking, like the conventional α- and γ-profiles, a mixed-profile restricts either the α or the γ parameters to fixed values. Yet, it improves on the accuracy of the conventional profiles in two ways. First, it does not impose uniform restrictions on all goods' satiation curves (i.e., α_k = 0 for all k, or γ_k = 1 for all k). As mentioned in Bhat (2008), for some goods the α-profile is found more accurate than the γ-profile, while for others the γ-profile outperforms the α-profile. Imposing uniform restrictions on all alternative goods of a discrete choice problem may consequently force the satiation curves of some of them to follow a functional form that cannot be well-matched to the corresponding accurate profile. More importantly, the mixed-profile does not impose predefined restrictions (i.e., α_k = 0 or γ_k = 1) that are not guaranteed to be accurate enough in all discrete choice problems, as implied by the definition of empirical unidentifiability.
The models in this paper are estimated using a 50% random sub-sample drawn from over 22,000 observations in the Atlanta Regional Travel Survey (2011) data, and empirical tests are conducted using the remaining observations. The tests are designed to investigate: (1) the performance of HELPME in estimating the mixed-profile model, and (2) various in-sample and out-of-sample accuracy measures of the model. The accuracy metrics include BIC, AIC, the likelihood ratio, MAPE (Mean Absolute Percentage Error), overall biasedness, and overall satiation plots.
On the HELPME performance side, efficiency is explored after completion of both the Coarse and Fine stages. Per the results, the Coarse stage of HELPME finds better and more stable solutions compared to the original EML. To be specific, the final log-likelihood value is improved by 85% and its coefficient of variation (CV) is decreased by 54%. Furthermore, the CVs of both the final likelihood and the model parameters are considerably low in the Fine stage. On the side of the mixed- and conventional-profiles' accuracy, the following results are achieved:
- BIC, AIC, and likelihood ratio tests indicate that the mixed-profile improves the model's in-sample fit by about 1% with a 99.99% level of confidence.
- MAPE results show that the model's out-of-sample prediction errors are drastically reduced for all activities. The reductions range from 68% (for Other) to 97% (for Education) and are significant at a 90% level of confidence, except for Other, which is significant at a 75% level.
- Regarding out-of-sample mean predictions, the mixed-profile is shown to impose considerably less bias on the durations of Shop, Recreation, Eat Meal, Maintenance, and Other. Among these activities, the lowest and highest bias reductions correspond to Recreation (9%) and Eat Meal (67%), respectively.
- Although observations and the mixed-profiles consistently show highly satiated consumption patterns for Shop, Recreation, and Eat Meal, the conventional profiles suggest low or medium levels of satiation. In addition, the conventional profiles fit nearly linear profiles for Work, while the mixed-profile introduces some level of satiation.
Due to space and scope restrictions, not all aspects of the problem could be explored in one paper, calling for future studies. First, we have only focused on the α and γ profiles of a basic MDCEV formulation (which adopts the IID assumption). In fact, the extent of the gap between mixed- and conventional-profiles may be reduced by introducing scale heterogeneities and/or non-additively-separable utility functions. Future studies may take such effects into account. Second, future studies may take advantage of the controlled setting provided by synthesized data and run various scenario-based analyses. Using empirical data in this paper helps us make sure that the analyses are credible from the viewpoints of estimation burden and predictive ability. Third, given the meta-heuristic, gradient-free Coarse stage of HELPME, the proposed paradigm seems to have the potential to reduce the estimation burden of other econometric models, especially those with hard-to-evaluate gradients and those that can easily fall into the local-optima trap. Future studies may also explore HELPME's performance in estimating such models. Finally, although we used various in-sample and out-of-sample accuracy measures to explore the extent of the predictive gap between conventional and mixed-profiles, how this gap could affect policies and decisions in real-life applications is still an open question, which may be addressed in complementary papers.
Acknowledgements
We appreciate Professor Chandra Bhat for sharing the source code of the MDCEV model and its documentation. We would also like to thank three anonymous reviewers whose comments have helped greatly to strengthen the arguments within the paper, as well as the directions for future studies.
References
Atlanta Regional Commission. (2011). Atlanta Regional Travel Survey Final Report.
Armijo, Larry. “Minimization of functions having Lipschitz continuous first partial derivatives.”
Pacific Journal of mathematics 16, no. 1 (1966): 1-3.
Armstrong, J. Scott, and Fred Collopy. "Error measures for generalizing about forecasting methods:
Empirical comparisons." International journal of forecasting 8, no. 1 (1992): 69-80.
Armstrong, J. Scott. "Evaluating forecasting methods." In Principles of forecasting, pp. 443-472.
Springer US, 2001.
Bhat, Chandra R. “A multiple discrete–continuous extreme value model: formulation and application to discretionary time-use decisions.” Transportation Research Part B: Methodological 39, no. 8 (2005): 679-707.
Bhat, Chandra R., and Sivaramakrishnan Srinivasan. “A multidimensional mixed ordered-response model for analyzing weekend activity participation.” Transportation Research Part B: Methodological 39, no. 3 (2005): 255-278.
Bhat, Chandra R., Sivaramakrishnan Srinivasan, and Sudeshna Sen. “A joint model for the perfect and imperfect substitute goods case: application to activity time-use decisions.” Transportation Research Part B: Methodological 40, no. 10 (2006): 827-850.
Bhat, Chandra R. “The multiple discrete-continuous extreme value (MDCEV) model: role of utility function parameters, identification considerations, and model extensions.” Transportation Research Part B: Methodological 42, no. 3 (2008): 274-303.
Bhat, Chandra R., Konstadinos G. Goulias, Ram M. Pendyala, Rajesh Paleti, Raghuprasad Sidharthan,
Laura Schmitt, and Hsi-Hwa Hu. “A household-level activity pattern generation model with an
application for Southern California.”Transportation 40, no. 5 (2013): 1063-1086.
Birbil, Ş. İlker, and Shu-Cherng Fang. “An electromagnetism-like mechanism for global optimization.” Journal of Global Optimization 25, no. 3 (2003): 263-282.
Edwards, Yancy D., and Greg M. Allenby. “Multivariate analysis of multiple response data.” Journal of Marketing Research 40, no. 3 (2003): 321-334.
El-Gallad, A. I., A. A. Sallam, and M. E. El-Hawary. “Swarming of intelligent particles for solving the nonlinear constrained optimization problem.” International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications 9, no. 3 (2001): 155-164.
Eluru, Naveen, Abdul R. Pinjari, Ram M. Pendyala, and Chandra R. Bhat. “A Unified Model System
of Activity Type Choice, Activity Duration, Activity Timing, Mode Choice, and Destination
Choice.” Working Paper, The University of Texas at Austin, Texas, 2009.
Frank, Marguerite, and Philip Wolfe. “An algorithm for quadratic programming.” Naval Research Logistics Quarterly 3, no. 1-2 (1956): 95-110.
Greene, William H. Econometric analysis. Pearson Education India, 2003.
Goodwin, Paul, and Richard Lawton. "On the asymmetry of the symmetric MAPE." International
journal of forecasting 15, no. 4 (1999): 405-408.
Hendel, Igal. “Estimating multiple-discrete choice models: An application to computerization returns.” Review of Economic Studies (1999): 423-446.
Hyndman, Rob J., and Anne B. Koehler. "Another look at measures of forecast accuracy."
International journal of forecasting 22, no. 4 (2006): 679-688.
Kenny, D. A. (1979), “Correlation and causality,” New York: Wiley.
Kim, Jaehwan, Greg M. Allenby, and Peter E. Rossi. “Modeling consumer demand for
variety.” Marketing Science 21, no. 3 (2002): 229-250.
Koehler, A. B. "The asymmetry of the sAPE measure and other comments on the M3-competition."
International Journal of Forecasting 17, no. 4 (2001): 570-574.
LaMondia, Jeffrey, Chandra R. Bhat, and David A. Hensher. "An annual time use model for domestic
vacation travel." Journal of Choice Modelling 1, no. 1 (2008): 70-97.
Liao, Ching-Jong, Yu-Wei Kuo, Tsui-Ping Chung, and Stephen C. Shih. “Integrating production and transportation scheduling in a two-stage supply chain.” European Journal of Industrial Engineering 9, no. 3 (2015): 327-343.
Liu, Yu-Hsin, and Hani S. Mahmassani. “Global maximum likelihood estimation procedure for
multinomial probit (MNP) model parameters.” Transportation Research Part B:
Methodological 34, no. 5 (2000): 419-449.
Liang, Gao, Wang Xiaojuan, Wei, and Chen Yazhou. “A modified algorithm for electromagnetism-
like mechanism.”Journal of Huazhong University of Science and Technology (Nature Science
Edition) 11 (2006): 001.
Makridakis, Spyros, and Michele Hibon. "The M3-Competition: results, conclusions and implications."
International journal of forecasting 16, no. 4 (2000): 451-476.
Manchanda, Puneet, Asim Ansari, and Sunil Gupta. “The ‘shopping basket’: A model for multicategory purchase incidence decisions.” Marketing Science 18, no. 2 (1999): 95-114.
Mattson, Jeremy Wade. Travel behavior and mobility of transportation-disadvantaged populations:
Evidence from the National Household Travel Survey. No. DP-258. Upper Great Plains
Transportation Institute, 2012.
Pinjari, Abdul Rawoof, and Chandra Bhat. “A Multiple Discrete–Continuous Nested Extreme Value
(MDCNEV) model: formulation and application to non-worker activity time-use and timing
behavior on weekdays.” Transportation Research Part B: Methodological 44, no. 4 (2010): 562-
583.
Pinjari, Abdul Rawoof, and Chandra Bhat. “An efficient forecasting procedure for Kuhn-Tucker consumer demand model systems.” Technical paper. Department of Civil & Environmental Engineering, University of South Florida (2010).
Sivakumar, Aruna, and Chandra Bhat. “Fractional split-distribution model for statewide commodity-
flow analysis.” Transportation Research Record: Journal of the Transportation Research Board
1790 (2002): 80-88.
Sobhani, Anae, Naveen Eluru, and Ahmadreza Faghih-Imani. “A latent segmentation based multiple
discrete continuous extreme value model.” Transportation Research Part B: Methodological 58
(2013): 154-169.
Sobhani, Anae, Naveen Eluru, and Abdul R. Pinjari. “Evolution of Adults’ Weekday Time Use
Patterns from 1992 to 2010: A Canadian Perspective.” In 93rd Annual Meeting of the
Transportation Research Board (TRB), Washington, DC. 2014.
Srinivasan, Nanda, Nancy McGuckin, and Elaine Murakami. "Working retirement: Travel trends of the
aging workforce." Transportation Research Record: Journal of the Transportation Research
Board 1985 (2006): 61-70.
Train, Kenneth E. Discrete choice methods with simulation. Cambridge university press, 2009.
Vasquez Lavin, Felipe, and W. Michael Hanemann. “Functional forms in discrete/continuous choice
models with general corner solution.” Department of Agricultural & Resource Economics,
UCB (2008).
Van Nostrand, Caleb, Vijayaraghavan Sivaraman, and Abdul Rawoof Pinjari. “Analysis of long-distance vacation travel demand in the United States: a multiple discrete–continuous choice framework.” Transportation 40, no. 1 (2013): 151-171.
Vij, Akshay, and Joan L. Walker. “Hybrid choice models: The identification problem.” In Handbook of Choice Modelling. Edward Elgar Publishing, 2014.
Wales, Terence J., and Alan Donald Woodland. “Estimation of consumer demand systems with
binding non-negativity constraints.”Journal of Econometrics 21, no. 3 (1983): 263-285.
Walker, Joan L., Moshe Ben-Akiva, and Denis Bolduc. "Identification of parameters in normal error
component logit-mixture (NECLM) models." Journal of Applied Econometrics 22, no. 6
(2007): 1095-1125.
Welch, Bernard L. “The generalization of ‘Student's’ problem when several different population
variances are involved.” Biometrika 34 (1947): 28-35.
Yurtkuran, Alkın, and Erdal Emel. “A new hybrid electromagnetism-like algorithm for capacitated
vehicle routing problems.” Expert Systems with Applications 37, no. 4 (2010): 3427-3433.
Yu, Biying, Junyi Zhang, and Akimasa Fujiwara. "Representing in-home and out-of-home energy
consumption behavior in Beijing." Energy Policy 39, no. 7 (2011): 4168-4177.
Zhang, Chunjiang, Xinyu Li, Liang Gao, and Qing Wu. “An improved electromagnetism-like
mechanism algorithm for constrained optimization.” Expert Systems with Applications 40, no.
14 (2013): 5621-5634.
Appendix 1
Aggregate error metrics can generally be grouped into three categories: benchmark-based relative
measures, scale-dependent measures, and percentage-based measures. As the name suggests, measures
of the first group use errors obtained from a benchmark forecasting method (e.g., random-walk
naïve forecasts) to scale the errors of the method of interest (Hyndman & Koehler 2006). Since no
benchmark method could be found for generating activity durations, such metrics could not be used
in our study.
Among the most frequently used metrics of the second group are Root Mean Squared Error
(RMSE) and Mean Absolute Error (MAE). These metrics are calculated using Eq. 23 and Eq. 24,
where O_n and P_n denote the nth observed and predicted values, and N is the total number of
observations.
MAE and RMSE, like any other scale-dependent measure, have two prominent disadvantages. First,
they have the same scale as the data (Hyndman & Koehler 2006); it would therefore be meaningless
to use them for comparing prediction results of the γ-profile model against those of the
mixed-profile model, since the two prediction sets have different scales. Second, neither MAE nor
RMSE distinguishes the percentage magnitude of an error. For instance, the gap between 5 and 10 (i.e., a 100% error) is
treated the same as the gap between 50 and 55 (i.e., a 10% error). In the case of RMSE, it has also been
extensively argued that the metric is highly sensitive to outliers and, consequently, may not be a valid
measure of accuracy (Armstrong & Collopy 1992; Armstrong 2001; Hyndman & Koehler 2006).
Measures of the third group have neither of the deficiencies mentioned for scale-dependent
metrics. A popular metric of this category is Mean Absolute Percentage Error (MAPE), which is
calculated using Eq. 25. The primary drawbacks of MAPE are (Hyndman & Koehler 2006): (1) it loses accuracy
when some observations are close or equal to zero, (2) it places more weight on positive
errors than on negative errors in some situations, and (3) it does not yield the same error
measure when forecasts and observations are interchanged; consider, for example, the cases of
(O_n = a, P_n = b) and (O_n = b, P_n = a). The so-called Symmetric MAPE (sMAPE), as
shown in Eq. 26, was introduced mainly to resolve these drawbacks (Hyndman & Koehler 2006). As Hyndman
& Koehler (2006) argued, though, it is not guaranteed that sMAPE is significantly less sensitive to
the existence of very small observations. Furthermore, as Goodwin & Lawton (1999) show, sMAPE imposes
different penalties on positive and negative errors in some situations while MAPE does not.
In particular, the authors state: “it [sMAPE] in fact creates a new problem of asymmetry which is more
likely to be of practical concern than the problem resulting from the interchange. Indeed, the
conventional APE [that is, MAPE before aggregation] does not treat single errors above the actual
value any differently from those below it. If the actual value is 100 units, errors of -10 and +10 units
both result in an APE of 10%. The modified APE [that is, sMAPE before aggregation] does treat them
differently. For example, the errors of -10 and +10 units, given above, would result in modified APEs
of 18.18% and 22.2%, respectively.” More importantly, the level of asymmetry in sMAPE is argued to
depend notably on the magnitude of the percentage errors (Goodwin & Lawton 1999; Koehler 2001;
Hyndman & Koehler 2006). For instance, Goodwin & Lawton (1999) show that when the forecast
error is +100% the modified APE is three times higher than when the error is -100%. Having observed
notably large percentage errors associated with the γ-profile model in our study (as outlined in the
second column of Table 5), we decided to avoid sMAPE and use MAPE. Furthermore,
following the M3-Competition (Makridakis & Hibon 2000), we excluded observations that are
exactly zero from the MAPE calculations to avoid infinite values.
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N} \left( O_n - P_n \right)^2} \qquad (23) \]

\[ \mathrm{MAE} = \frac{1}{N}\sum_{n=1}^{N} \left| O_n - P_n \right| \qquad (24) \]

\[ \mathrm{MAPE} = \frac{100}{N}\sum_{n=1}^{N} \frac{\left| O_n - P_n \right|}{O_n} \qquad (25) \]

\[ \mathrm{sMAPE} = \frac{200}{N}\sum_{n=1}^{N} \frac{\left| O_n - P_n \right|}{O_n + P_n} \qquad (26) \]
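For readers who wish to reproduce the error computations, the four measures in Eqs. 23–26 can be sketched in a few lines of Python (the paper's own code is in the MATA language of Stata; this is an illustrative re-implementation, with the M3-Competition zero-observation exclusion applied inside `mape`):

```python
import math

def rmse(obs, pred):
    """Root Mean Squared Error (Eq. 23)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean Absolute Error (Eq. 24)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean Absolute Percentage Error (Eq. 25); observations equal to zero
    are excluded, following the M3-Competition practice noted in the text."""
    pairs = [(o, p) for o, p in zip(obs, pred) if o != 0]
    return 100.0 / len(pairs) * sum(abs(o - p) / o for o, p in pairs)

def smape(obs, pred):
    """Symmetric MAPE (Eq. 26)."""
    return 200.0 / len(obs) * sum(abs(o - p) / (o + p)
                                  for o, p in zip(obs, pred))

# Scale dependence: the gap between 5 and 10 (a 100% error) and the gap
# between 50 and 55 (a 10% error) contribute identically to MAE, while
# MAPE tells them apart.
print(mae([10, 55], [5, 50]))   # → 5.0
print(mape([10, 55], [5, 50]))

# Under Eq. 26, sMAPE penalizes under- and over-predictions of the same
# absolute magnitude differently, whereas the plain APE treats them equally.
print(smape([100], [90]), smape([100], [110]))
```

The last line illustrates the asymmetry discussed above: a forecast of 90 against an observation of 100 receives a larger sMAPE penalty than a forecast of 110, even though both errors have magnitude 10.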