Estimating a Mixed-profile MDCEV: Case of Daily Activity
Type and Duration
Ali Shamshiripour1 and Amir Samimi 2,*
1PhD Student, Department of Civil and Materials Engineering, University of Illinois at Chicago,
Chicago, IL, USA
2 Assistant Professor, Department of Civil Engineering, Sharif University of Technology, Tehran, Iran
* Corresponding author, Email: asamimi@sharif.edu
Abstract
MDCEV has become popular in recent years. Yet, the model suffers from an ‘empirical
identification’ issue that is mainly due to inter-relations between two of its parameters, α and γ.
This paper presents a hybrid optimization paradigm (named HELPME) to address this issue in a
basic MDCEV formulation and take full advantage of the model by estimating a ‘mixed-profile’.
HELPME benefits from a coarse-to-fine search strategy, in which a customized
Electromagnetism-like (EML) meta-heuristic precedes a gradient-based approach. The Atlanta
Regional Travel Survey (2011) is used to empirically analyse performance of HELPME as well
as the significance of the accuracy gap between the mixed-profile, and α- and γ-profiles. As part of
the results, it is observed that in-sample fit is significantly improved, percentage error of out-of-sample
prediction is reduced by up to 97% at a 90% confidence level, and bias of out-of-sample
predictions is reduced by up to 67%.
Keywords: Maximum-likelihood Estimation; Electromagnetism; MDCEV; HELPME.
1. Introduction
The basic Multiple Discrete Continuous Extreme Value (MDCEV) model has an ‘empirical
identification’ issue between the satiation (α) and translation (γ) parameters, as Bhat (2008) states:
“Clearly, both these effects operate in different ways, and different combinations of their values lead to
different satiation profiles. However, empirically speaking, it is very difficult to disentangle the two
effects separately.” Empirical unidentifiability was first introduced in Kenny (1979) as a
situation in which, although two parameters of a model are theoretically identifiable, it is still very
difficult to estimate them using the data at hand (Kenny 1979 and Vij & Walker 2014). Given this
definition, the identification issue of MDCEV is theoretically resolvable.¹
To deal with the empirical identification issue, Bhat (2008) proposed two alternative
approaches based on imposing different restrictions on the model. The first approach is to impose
predefined restrictions on one or both parameters, estimate the restricted profiles, and then choose the
best model (Bhat 2008, p. 9-10). Particularly, an α-profile sets γ_k to 1 for all goods k, and a γ-profile
assumes that all α_k values approach zero. We refer to these profiles as conventional, since they have
extensively been used in past studies including long-distance travelling analysis (Van Nostrand et al.
2013), household activity-accompaniment patterns (Bhat et al. 2013), energy consumption behaviour
(Yu et al. 2011), and annual vacation time-use (LaMondia et al. 2008). As a second approach, Bhat
(2008, p. 10) also suggests picking either of the profiles a priori and experimentally finding proper
¹ By definition, two unknown parameters of a model are theoretically unidentifiable if two distinct combinations
of their values could be found such that they both result in equal (not just similar) distributions of the outcome
variable (Walker 2007 and Vij & Walker 2014). For instance, the variance of the error term (σ) and the coefficients of
attributes (β) in an MNL formulation are known to be theoretically unidentifiable (Train 2009). That is, there is
no way to estimate them both, in that the probability distributions are shown to be functions of β/σ rather than β
and σ.
fixed values of (for a -profile) or (for an -profile). The current paper proposes a systematic
search paradigm (HELPME) to fulfil the second proposition of Bhat (2008, p. 10). In fact, HELPME
makes it possible to find the best values in which  (or) can be fixed and take full advantage of the
model.
The remainder of this paper is organized as follows. First, we present some background
knowledge on MDCEV including interpretation of each of its parameters. After that, HELPME is
expressed in detail. Then, an empirical analysis on outdoor activity types and durations is performed
using the 2008 version of the basic MDCEV (i.e., the formulation that adopts IID extreme value error terms) and the
Atlanta Regional Travel Survey (ARC 2011) dataset. Particularly, HELPME’s performance and
differences between conventional- and mixed-profiles are discussed in detail. Lastly, the paper
concludes with discussions on remarkable findings of the present research as well as directions for
future studies.
2. MDCEV formulation and interpretation
Traditional discrete choice models deal with a choice set of perfectly substitutable alternatives. In many
choice situations, however, the consumer demands certain amounts of multiple alternatives, given a
limited budget. Various methods have been developed to analyse such choices (Manchanda et al. 1999;
Hendel 1999; Edwards & Allenby 2003; Bhat & Srinivasan 2005; Wales & Woodland 1983; Kim et al.
2002; and Bhat 2005 and 2008). One closed-form formulation is MDCEV, first introduced in Bhat
(2005). Bhat (2008) later modified the utility function of that model into a more general and
easier-to-interpret form. Meanwhile, other variants of the model have been
proposed (Bhat et al. 2006; Pinjari & Bhat 2010; Vasquez & Hanemann 2008; Eluru et al. 2009;
Sobhani et al. 2013; and Sobhani et al. 2014), forming the large family of MDCEV models. The
current study, however, only focuses on the basic MDCEV formulation proposed in Bhat (2008).
In the 2008 version of MDCEV, the total utility that an individual acquires from consuming c_k units of
every good k is determined by Eq. 1 (Bhat 2008), assuming: (1) independent observed utility portions
and (2) non-negative marginal utilities. In this function, ψ, α, and γ are three sets of parameters that
should satisfy certain conditions (ψ_k > 0, α_k ≤ 1, and γ_k > 0). The term ψ_k captures the baseline (i.e.,
pre-consumption) marginal utility of k, that is, the slope of its utility function with respect to c_k when
c_k = 0 (Bhat 2008, p. 7). This parameter’s value is given by Eq. 2, which guarantees positivity. In Eq.
2, z_k and β, respectively, stand for the vector of alternative goods’ characteristics and their
corresponding coefficients. Moreover, ε_k denotes the error term of the kth good. In the basic MDCEV
model which is explored in the current paper, error terms are assumed to follow an IID extreme value
distribution with the scale parameter σ, which may further be normalized to 1 (Bhat 2008).

U(c) = Σ_{k=1}^{K} (γ_k / α_k) ψ_k [ (c_k / γ_k + 1)^{α_k} − 1 ]   (1)

ψ_k = exp(β′ z_k + ε_k)   (2)
α_k is the satiation parameter and governs the diminishing rate of each good’s marginal utility as its
consumption quantity increases. γ_k is named the translation parameter, as its main
role is to prevent indifference curves from becoming asymptotic to the axes in the positive orthant, by
shifting the points at which they approach the axes (Bhat 2008). In addition, γ also influences
satiation patterns by changing the slope of indifference curves. The empirical identification problem in
MDCEV mainly results from such interrelationships between the roles of α and γ.

On the interpretation side, α_k = 1 indicates insatiable consumption² of good k, regardless of
the value of γ_k (Bhat 2008). In the case that γ remains constant, lower values of α produce
more satiable utility profiles (Bhat 2008). Theoretically speaking, α_k → −∞ corresponds to the case of
immediate satiation (Bhat 2008). In the case of constant α, further, the satiation effect intensifies as γ
decreases (Bhat 2008). As mentioned, however, α and γ simultaneously affect the rate of the utility
function’s curvature in any version of MDCEV. Given such a shared role, one can hardly interpret
either α or γ alone when they both are subject to variation over alternatives (e.g. in a mixed-profile).
This is not a limitation of the model, however, since the interpretation can simply be
accomplished through development of consumption-utility plots. Such graphs are straightforward to
draw and easy to interpret. One may simply set ψ_k to any unified fixed value for all k, plot the utility
value of each good versus the consumption amount of that good, and visually interpret the results.

² In a hypothetical situation of α_k = 1 for all k, therefore, the model collapses to a simple MNL (Bhat 2008).
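As a concrete illustration of these satiation roles, the sub-utility of a single good in Eq. 1 can be evaluated numerically. The sketch below is in Python rather than the paper's MATA code, and the values of ψ, α, and γ are arbitrary illustrative choices; holding ψ fixed, a smaller α flattens the utility curve faster:

```python
import numpy as np

def sub_utility(c, psi, alpha, gamma):
    # One good's sub-utility in Eq. 1: (gamma/alpha) * psi * ((c/gamma + 1)^alpha - 1)
    return (gamma / alpha) * psi * ((c / gamma + 1.0) ** alpha - 1.0)

c = np.linspace(0.0, 10.0, 101)
u_mild = sub_utility(c, psi=1.0, alpha=0.8, gamma=1.0)    # weak satiation
u_strong = sub_utility(c, psi=1.0, alpha=0.2, gamma=1.0)  # strong satiation

# Both curves leave c = 0 with the same baseline slope (psi), but the
# low-alpha profile flattens much sooner:
slope_mild = u_mild[-1] - u_mild[-2]
slope_strong = u_strong[-1] - u_strong[-2]
```

Plotting such curves for each good, with ψ_k held at a common value, is exactly the consumption-utility graph described above.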
Let p_k and E denote the price of the kth good and the budget, respectively. In this notation, the
budget constraint implies that Σ_{k=1}^{K} e_k = E, where e_k = p_k c_k (Bhat 2008). Employing the Lagrangian
approach and KT conditions, the probability that the estimated optimal expenditure
allocations match the expenditure patterns observed in the data is derived as Eq. 3. In this formula, i
and M, respectively, denote the index and the total number of goods that each individual consumes.
P(e_1*, e_2*, …, e_M*, 0, …, 0) = [ Π_{i=1}^{M} f_i ] [ Σ_{i=1}^{M} (1/f_i) ] [ Π_{i=1}^{M} exp(V_i) / ( Σ_{k=1}^{K} exp(V_k) )^M ] (M − 1)!   (3)

Where:

V_i = β′ z_i + (α_i − 1) ln( e_i*/(γ_i p_i) + 1 ) − ln(p_i)   (4)

f_i = (1 − α_i) / (e_i* + γ_i p_i)   (5)
Having the choice probability function in Eq. 3, the unknown parameters of a mixed-profile for the basic
MDCEV model can be estimated by maximizing the model’s log-likelihood LL = Σ_{n=1}^{N} ln(P_n), subject
to ψ_k > 0, α_k ≤ 1, and γ_k > 0 for every k. The maximization, in this paper, is conducted using
HELPME to eliminate the need for imposing further constraints (e.g. α_k → 0 in a γ-profile or γ_k = 1
in an α-profile).
3. HELPME
Various methods have been used for likelihood maximization. A popular gradient-based algorithm is
BFGS (Broyden-Fletcher-Goldfarb-Shanno). BFGS is a quasi-Newton method that iteratively utilizes
first-order gradients to approximate the Hessian (Greene 2003). However, BFGS dramatically loses its
efficiency as the model becomes more complex, similar to other gradient-based algorithms (Train
2009). Meta-heuristic algorithms, on the other hand, are based on random movements inspired
by ad-hoc rules, instead of gradient information. ElectroMagnetism-Like (EML) is a popular meta-heuristic
that has proven successful in different disciplines (Yurtkuran & Emel 2010 and Liao et
al. 2015). Efforts (Liu & Mahmassani 2000) have also been made to combine meta-heuristics and
gradient-based methods to boost the optimizers’ capabilities. To overcome the empirical identification
problem of the MDCEV model, we introduce a hybrid paradigm combining EML and BFGS in a
coarse-to-fine search strategy.
The paradigm is named HELPME (Hybrid Electromagnetism-Like Paradigm for Maximum-likelihood
Estimation). Fig. 1 shows a general scheme of HELPME. The Coarse stage of HELPME
employs a customized EML algorithm to find a good starting value that is then fed into a recursive-BFGS
in the Fine stage to be fine-tuned. Compared to gradient-based optimizers, this strategy helps to: (1)
amplify the optimizer’s efficiency in estimation of models with complex likelihood functions and (2)
reduce the chance of falling into a local optimum trap. Compared to a standard meta-heuristic, on the other
hand, it helps to: (1) reduce sensitivity of the final estimates’ accuracy to the random numbers, and (2)
guarantee reaching a point at which partial gradients are sufficiently close to zero.
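The coarse-to-fine idea itself is independent of EML. The toy Python sketch below substitutes a plain random population search for the Coarse stage and SciPy's BFGS for the Fine stage, just to show the division of labour; the objective `f` and all settings are illustrative, not from the paper:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def coarse_to_fine(neg_ll, bounds, n_particles=40, n_iter=60):
    """Toy coarse-to-fine search: a random population stage (standing in for
    HELPME's customized EML) followed by gradient-based BFGS polishing."""
    lo, hi = bounds[:, 0], bounds[:, 1]
    pop = lo + rng.random((n_particles, lo.size)) * (hi - lo)   # coarse stage
    for _ in range(n_iter):
        cand = np.clip(pop + rng.normal(scale=0.1, size=pop.shape) * (hi - lo), lo, hi)
        for i in range(n_particles):
            if neg_ll(cand[i]) < neg_ll(pop[i]):  # greedy acceptance
                pop[i] = cand[i]
    best = min(pop, key=neg_ll)
    res = minimize(neg_ll, best, method="BFGS")                 # fine stage
    return best, res.x

# a multimodal toy surface standing in for the negative log-likelihood
f = lambda x: x[0] ** 2 + x[1] ** 2 + 2.0 * np.sin(3.0 * x[0]) ** 2
best_coarse, x_fine = coarse_to_fine(f, np.array([[-3.0, 3.0], [-3.0, 3.0]]))
```

The fine stage can only improve on the best coarse particle, which is the point of running the cheap population search first.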
EML is an evolutionary meta-heuristic that imitates natural interactions between electrically
charged particles (Birbil & Fang 2003). The algorithm initiates with a starting-generation of feasible
particles in the domain of the problem, with each particle representing feasible values for the optimization
variables (i.e. unknowns of the model to be estimated). Then, a certain amount of hypothetical electrical
charge is assigned to each particle to flag ‘good’ and ‘bad’ particles (Birbil & Fang 2003). The
electrical charge of each particle is assumed to be proportionate to the value of the log-likelihood
associated with it, and determines the particle’s capacity to attract or repel others (Birbil & Fang
2003). The electromagnetic force between each pair of particles is then calculated and exerted (Birbil &
Fang 2003), such that good particles attract others while bad particles repel them. These forces eventually
lead particles to move and produce the next generation of particles (Birbil & Fang 2003). To further
enhance the chance of escaping from local optima, a random local search is also implemented in each
iteration. The procedure iterates until the ‘best’ particle (i.e. the particle with the
largest log-likelihood) gets sufficiently close to the maximum of the log-likelihood function. The Coarse stage
of HELPME, in addition to suggesting some customizations in each of these steps, introduces a new
direct-search heuristic (Stubborn Movement) to further boost the algorithm’s efficiency.
In each of the following three sub-sections, a step of EML is explained first, and then the
suggested customizations are discussed. After that, the remaining components of HELPME are elaborated.
All the notations are defined in Table 1. Furthermore, the marginal effect of each EML customization on
estimating a mixed-profile MDCEV is discussed in section 4.1. Source code of HELPME along with
the data used in this paper is also accessible online³. For a more detailed discussion on EML, readers
may also refer to Birbil & Fang (2003) and Yurtkuran & Emel (2010).

³ Available at https://www.dropbox.com/sh/lhq4cirywuv6f2p/AABpvZJ9C6bvOZNggyUrkf31a?dl=0
Figure 1. General scheme of the HELPME algorithm
Table 1. Description of notations used in HELPME

Variables:
- Number of dimensions of the problem
- Number of particles
- Upper bound of the dth coordinate
- Lower bound of the dth coordinate
- dth coordinate of the pth particle in the gth generation
- Vector of coordinates of the pth particle in the gth generation
- Vector of coordinates of the particle which has the best function value
- Objective function value associated with the pth particle in the gth generation

Parameters (Random Local Search):
- A random number drawn from the standard uniform distribution in the gth generation
- A random number drawn from the standard uniform distribution in the gth generation
- A random number drawn from the standard uniform distribution for the dth coordinate in the gth generation
- A predefined parameter between 0 and 1
- A generation-specific value of the above parameter
- Maximum feasible movement length
- Maximum feasible movement length for the dth coordinate in the gth generation
- Number of local search iterations

Parameters (Inter-particle Interactions):
- A vector denoting the maximum feasible movement range
- A parameter between 0 and 1, determining the exponent of the distance term in the total force formula of EML
- Simulated charge of the pth particle in the gth generation
- Total force exerted on the dth dimension of the pth particle in the gth generation
- Vector of total force exerted on the pth particle in the gth generation
- A predefined reduction factor in [0,1], determining the decrement rate of the exponent parameter

Parameters (Stubborn Movement):
- A predefined reduction factor in [0,1] to alter the number of SM iterations
- A random number drawn from a standard uniform distribution
- 1: If / 0: Otherwise
3.1. Initialization phase

In the original EML algorithm, the initial generation (i.e. all the initial particles) is randomly drawn
from a uniform distribution in the feasible search domain (Birbil & Fang 2003). Although the strategy
of uniform particle distribution helps to avoid the local optimum trap, it makes the algorithm inefficient on
the run-time side. Considering this, a more complex procedure is adopted in HELPME. The procedure
is elaborated below.

First, a simplified model’s estimates are deployed to generate a ‘good’ feasible solution. This
particle is termed the ‘key-particle’ and is incorporated in the starting-generation. We estimated a γ-profile
MDCEV to find the key-particle. After that, the feasible domain of the likelihood-maximization problem
is rescaled, by normalizing each coordinate of the domain to its corresponding coordinate of the key-particle.
All other particles are generated in the rescaled feasible domain⁴. Notably, the key-particle
also takes the form of an all-ones vector in this domain. In the next step, a few Stubborn Movements⁵
are performed to improve the normalized, random particles. The last step adds a ‘mediocre’ particle to the
set of generated particles. Each coordinate of this particle is equal to the mean value of that coordinate
over the other particles. These steps are also depicted in Fig. 1.
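A minimal Python sketch of this initialization, under the assumption that rescaling means dividing each coordinate and its bounds by the corresponding key-particle coordinate (the function and variable names are ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

def init_generation(key_particle, lower, upper, n_particles):
    """Sketch of HELPME's initialization: rescale the domain by the key-particle
    (so the key-particle becomes an all-ones vector), draw random particles in
    the rescaled domain, and append a 'mediocre' mean particle."""
    key = np.asarray(key_particle, float)
    lo, hi = lower / key, upper / key                # rescale bounds by the key-particle
    lo, hi = np.minimum(lo, hi), np.maximum(lo, hi)  # keep ordering when key < 0
    pop = lo + rng.random((n_particles, key.size)) * (hi - lo)
    pop[0] = 1.0                                     # the key-particle is all-ones here
    mediocre = pop.mean(axis=0)                      # 'mediocre' particle (coordinate means)
    return np.vstack([pop, mediocre])

gen = init_generation([2.0, -0.5], np.array([0.1, -4.0]), np.array([10.0, 4.0]), 5)
```

The Stubborn Movement improvements of the random particles (section 3.4) would be applied between drawing `pop` and forming the mediocre particle.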
3.2. Random local search

Efforts in a random local search are focused on finding a better position for each particle in its own
vicinity, in a way that the chance of trapping in a local optimum reduces. Let x denote the current position of
a particle, and y denote the position of a vicinity point. The random local search in the original EML is
performed using Eqs. 6 and 7, as discussed in the following. First, the maximum feasible movement
length (L) is calculated as a predefined percentage (denoted by δ) of the largest upper-bound-to-lower-bound
distance among all coordinates. Then, each coordinate of the current point (x) is moved
randomly either toward the upper or the lower bound by a random step size.

⁴ Zhang et al. (2013) evidenced that the accuracy of EML is highly sensitive to the relative scale of a problem’s
dimensions. The normalization step proposed in the present paper is intended to dampen the scale differences,
especially between the β parameters on one side, and the α and γ parameters on the other.

⁵ Stubborn Movement is a simple direct-search routine developed as part of HELPME. See section 3.4.
y_{gd}^p = x_{gd}^p ± λ_{2g} L   (6)

L = δ max_d (u_d − l_d)   (7)
On the other hand, HELPME’s random local search is performed using Eqs. 8 to 10 instead of Eqs. 6
and 7. In Eq. 10, MinIter is the minimum number of generations that is required to evolve. The
customized procedure differs from the original EML in two ways. First, it differentiates L across
dimensions (see Eq. 9). Having a constant L for all d may lead some coordinates of a particle to go
beyond the feasible domain. This increases the chance of assessing infeasible points and, thereby,
diminishes the overall performance of the algorithm. Second, it differentiates L from one generation to
another (see Eq. 10). While searching in a broader neighbourhood is more appealing in early
generations, smaller movements are expected in the final runs. As another advantage of Eqs. 8 to 10
over Eqs. 6 and 7, one may avoid calibration by assigning a large number (e.g. 1) to δ_1
and a small value (e.g. 0.01) to δ_MinIter without expecting a considerable performance deterioration.
The last customization, again, deals with the issue of infeasible particles. Although the aforementioned
changes reduce the chance of producing infeasible particles, feasibility is not yet guaranteed. There are
different strategies to maintain feasibility during a random search, among which we implemented a
common and simple approach, that is, disregarding infeasible particles (El-Gallad & Sallam 2001). A
general scheme of HELPME’s random local search is also depicted in Fig. 2.
y_{gd}^p = x_{gd}^p ± λ_{2g} L_{gd}   (8)

L_{gd} = δ_g (u_d − l_d)   (9)

δ_g = δ_1 exp( ln(δ_MinIter / δ_1) (g − 1) / (MinIter − 1) )   (10)
Figure 2. General scheme of the proposed random local search
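In Python, the customized search of Eqs. 8 to 10 might look as follows. The exponential interpolation used for δ_g is our reading of Eq. 10 (with δ_1 the starting value and δ_MinIter the final one), so treat that schedule as an assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

def delta_schedule(g, delta_1=1.0, delta_min=0.01, min_iter=20):
    # Assumed form of Eq. 10: geometric decay from delta_1 (g = 1) to delta_min (g = MinIter)
    return delta_1 * np.exp(np.log(delta_min / delta_1) * (g - 1) / (min_iter - 1))

def local_search(x, g, lo, hi):
    L = delta_schedule(g) * (hi - lo)                 # Eq. 9: per-dimension length
    y = x + (2.0 * rng.random(x.size) - 1.0) * L      # Eq. 8: random signed move
    return y if np.all((y >= lo) & (y <= hi)) else x  # discard infeasible points

y = local_search(np.array([0.5, 0.5]), g=1, lo=np.zeros(2), hi=np.ones(2))
```

Note the infeasibility handling: an out-of-bounds candidate is simply dropped, as in El-Gallad & Sallam (2001).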
3.3. Inter-particle interactions
Simulated particle charges are determined using Eq. 11, which assumes the charge q_g^p to be a function of
the relative deviation of each particle’s log-likelihood from the best particle’s log-likelihood. In this way,
higher charges are assigned to particles with larger log-likelihoods. The relative deviation is also
multiplied by the number of dimensions of the problem (D), for the sake of computational stability (see
Birbil & Fang 2003 for more details). Eq. 11 assigns positive charges to all particles, so the direction of
the forces should be determined through an if-then rule. The total electromagnetic force exerted on each
particle is calculated using Eq. 12, inspired by the Coulomb formula. The exponent parameter φ is
conventionally set to 1. However, many studies (Liang et al. 2006; Zhang et al. 2013) argued that when
a mediocre particle is far from the best particle, the force between them may be considerably lessened
by the quadratic Euclidean distance term in the denominator of Eq. 12. These studies, consequently,
suggested a simplified total force formula, in which the quadratic distance term is removed by setting
φ to zero.
q_g^p = exp( −D ( H(X_g^best) − H(X_g^p) ) / Σ_{l=1}^{P} ( H(X_g^best) − H(X_g^l) ) )   (11)

F_g^p = Σ_{m ≠ p} { (X_g^m − X_g^p) q_g^p q_g^m / ‖X_g^m − X_g^p‖^{2φ}   if H(X_g^m) > H(X_g^p)
                    (X_g^p − X_g^m) q_g^p q_g^m / ‖X_g^m − X_g^p‖^{2φ}   if H(X_g^m) ≤ H(X_g^p) }   (12)
HELPME applies a combination of the two aforementioned strategies to increase the efficiency of
the heuristic. We found that the simplified total force formula decreases the log-likelihood’s increment
rate in early generations, while it increases the tendency of particles to move to better points in final
generations. Therefore, in the first generations of HELPME, total forces are calculated using the conventional
Coulomb formula (i.e. φ = 1); and if the position of the best particle (X_g^best) is not improved after a certain
number of iterations, the parameter φ is decreased by a predefined reduction factor.
This procedure is depicted in Fig. 1. The proper value of the reduction factor could be obtained in a calibration process.
Having the total forces, each particle p is then moved toward the resultant force direction with a 0-to-1 random step size
(λ_g). The movement is conducted using Eq. 13, where RNG_g stands for a vector of components
(calculated by Eq. 14) denoting the maximum feasible movement range toward the upper or lower bounds.
X_{g+1}^p = X_g^p + λ_g ( F_g^p / ‖F_g^p‖ ) RNG_g   (13)

RNG_{gd} = { u_d − x_{gd}^p   if F_{gd}^p > 0
             x_{gd}^p − l_d   if F_{gd}^p ≤ 0 }   (14)
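The charge-and-force logic of Eqs. 11 and 12 can be sketched in Python as follows, for a maximization setting; the particle positions, log-likelihood values, and the small numerical guards are illustrative additions of ours:

```python
import numpy as np

def charges_and_forces(X, H, phi=1.0):
    """Eq. 11: simulated charges from relative log-likelihoods.
    Eq. 12: total electromagnetic forces (better particles attract, worse repel)."""
    P, D = X.shape
    best = H.max()
    q = np.exp(-D * (best - H) / np.maximum((best - H).sum(), 1e-12))  # Eq. 11
    F = np.zeros_like(X)
    for p in range(P):
        for m in range(P):
            if m == p:
                continue
            diff = X[m] - X[p]
            dist = np.linalg.norm(diff)
            mag = q[p] * q[m] / max(dist ** (2.0 * phi), 1e-12)
            # Eq. 12's if-then rule: move toward better particles, away from worse ones
            F[p] += diff * mag if H[m] > H[p] else -diff * mag
    return q, F

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
H = np.array([-10.0, -5.0, -8.0])   # log-likelihoods; particle 1 is best
q, F = charges_and_forces(X, H)
```

Setting `phi=0.0` reproduces the simplified total force formula discussed above.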
3.4. Stubborn Movement
The Stubborn Movement (SM) is developed in this paper as a direct-search heuristic. SM considers a
random direction in each of its own iterations, and obstinately searches to find the best point in that
direction. This heuristic is primarily proposed to increase the chance of escaping from local optima.
So, where the local optimum trap is not a concern, SM could be skipped to save run time. In this regard,
the number of SM iterations is diminished by a predefined reduction factor after each call, as shown in
Fig. 1.
The procedure of random direction finding in SM is mainly inspired by the FW (Frank &
Wolfe 1956) algorithm, which applies a convex combination of two points’ coordinates: the current
position of the particle (x) and a second position of it (y). The key difference between SM
and FW, though, is attributed to the way that y is determined. That is, in FW, y is found based on the first-order
approximation of the objective function around x, but in SM it is determined through a random
procedure based on mirror reflection of x. To be specific, SM determines y by mirror-reflecting half
of x’s current coordinates through the feasible domain’s mid-point, as shown in Eqs. 15 to 17. Reflected
coordinates are randomly chosen using λ_3.
y_{gd}^p = b_d^p y*_{gd}^p + (1 − b_d^p) x_{gd}^p   (15)

Where:

b_d^p = 1 if λ_{3d} ≤ 0.5, and 0 otherwise   (16)

y*_{gd}^p = u_d + l_d − x_{gd}^p   (17)
Figure 3. General scheme of SM algorithm
SM finds the optimal step size using a line-search technique (see Fig. 3) similar to the
backtracking procedure (Armijo 1966). The adopted approach starts with a large initial step size, which
is iteratively decreased until all the points lying between x and y are tested and the best possible
point is determined. However, the sufficient-decrement criterion (Armijo 1966) of the original
backtracking procedure is not applied in SM, to eliminate the algorithm’s dependency on gradient
information.
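A compact Python sketch of one SM iteration; the greedy scan over intermediate points stands in for the backtracking line search of Fig. 3, and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(3)

def stubborn_movement(x, f, lo, hi, n_steps=16):
    """One SM iteration: mirror-reflect a random half of x's coordinates through
    the domain mid-point to get a second point y, then scan the segment between
    x and y and keep the best point (a derivative-free step-size search)."""
    flip = rng.random(x.size) <= 0.5                  # coordinates chosen for reflection
    y = np.where(flip, hi + lo - x, x)                # mirror through the mid-point (lo+hi)/2
    ts = np.linspace(0.0, 1.0, n_steps + 1)
    candidates = [x + t * (y - x) for t in ts]        # points between x and y
    return max(candidates, key=f)                     # best point along the direction

f = lambda v: -float(np.sum((v - 0.8) ** 2))          # toy objective, maximum at (0.8, 0.8)
x0 = np.array([0.2, 0.2])
x_new = stubborn_movement(x0, f, np.zeros(2), np.ones(2))
```

Because the scan always includes t = 0 (i.e. x itself), an SM iteration can never worsen the particle.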
3.5. Coarse stage termination

Most meta-heuristic optimization algorithms terminate upon stability of the objective value and/or the
unknown variables, or after a predefined number of iterations. Birbil & Fang (2003) suggested a maximum
of 25D generations as a proper stopping criterion in the original EML. However, this criterion is not
appropriate for HELPME. In fact, HELPME could stop the meta-heuristic procedure in the Coarse stage
far earlier than 25 iterations per dimension, because of the improvements in EML and, more
importantly, the succeeding Fine stage. If the Coarse stage terminates too early, however, the optimization
process might not take full advantage of the heuristic. Thus, the following rules are set in HELPME to
terminate the Coarse stage:

At least MinIter generations are evolved (MinIter ∈ [1.5D, 3.5D]).

The best particle does not change in 0.1 MinIter successive generations.

The CPU-time elapsed to improve the objective value by one unit in evolving a generation
exceeds a critical value.

If the first criterion is met along with either the second or the third rule, the process is stopped.
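Expressed as a predicate in Python (with our assumption that the stale-best threshold of the second rule is 0.1·MinIter generations), the termination test reads:

```python
def coarse_should_stop(g, min_iter, stale_count, cpu_time_per_unit, time_cap):
    """Coarse-stage stopping rules: enough generations evolved, AND either a
    stale best particle or excessive CPU time per unit of objective improvement."""
    enough_generations = g >= min_iter               # first criterion
    stale_best = stale_count >= 0.1 * min_iter       # second criterion (assumed threshold)
    too_slow = cpu_time_per_unit > time_cap          # third criterion
    return enough_generations and (stale_best or too_slow)
```

Here `stale_count` is the number of successive generations without a new best particle, and `cpu_time_per_unit` is the elapsed CPU time per unit of objective improvement in the last generation.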
3.6. Fine stage
This stage is intended to improve the best particle found in the last generation by a gradient-based
routine. Here we employ a recursive-BFGS, which is depicted in Fig. 1. To be specific, the Satiation and
Translation parameters are sequentially fixed at their current values, while the other parameters are
improved by LSIter1 iterations of BFGS. The sequence continues until either the relative change in the
log-likelihood value or the maximum relative change in the parameters reaches pre-specified critical
values. Having met the stopping criteria, either the Translation or the Satiation parameters are fixed again and
the rest of the parameters are estimated precisely using the standard BFGS. Fixing the Translation or
Satiation parameters in this step results in a Satiation-based or a Translation-based mixed-profile,
respectively.
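The alternating structure of the recursive-BFGS can be sketched as follows. The round structure is our reading of the text, and the smooth toy objective stands in for the negative log-likelihood; this is not the paper's MATA implementation:

```python
import numpy as np
from scipy.optimize import minimize

def recursive_bfgs(neg_ll, beta0, alpha0, gamma0, ls_iter=5, tol=1e-6, max_rounds=50):
    """Fine-stage sketch: alternately hold the translation (gamma) and satiation
    (alpha) blocks fixed while a few BFGS iterations improve the rest; once the
    objective stabilizes, fix gamma for good (a Satiation-based mixed profile)."""
    beta = np.asarray(beta0, float)
    alpha = np.asarray(alpha0, float)
    gamma = np.asarray(gamma0, float)
    nb = beta.size
    prev = neg_ll(beta, alpha, gamma)
    for _ in range(max_rounds):
        # gamma fixed: improve (beta, alpha) by ls_iter BFGS iterations
        res = minimize(lambda t: neg_ll(t[:nb], t[nb:], gamma),
                       np.concatenate([beta, alpha]), method="BFGS",
                       options={"maxiter": ls_iter})
        beta, alpha = res.x[:nb], res.x[nb:]
        # alpha fixed: improve (beta, gamma)
        res = minimize(lambda t: neg_ll(t[:nb], alpha, t[nb:]),
                       np.concatenate([beta, gamma]), method="BFGS",
                       options={"maxiter": ls_iter})
        beta, gamma = res.x[:nb], res.x[nb:]
        cur = neg_ll(beta, alpha, gamma)
        if abs(prev - cur) < tol:
            break
        prev = cur
    # final precise estimation with gamma fixed
    res = minimize(lambda t: neg_ll(t[:nb], t[nb:], gamma),
                   np.concatenate([beta, alpha]), method="BFGS")
    return res.x[:nb], res.x[nb:], gamma

# toy smooth objective standing in for the negative log-likelihood
f = lambda b, a, g: (np.sum(b ** 2) + np.sum((a - 0.3) ** 2)
                     + np.sum((g - 2.0) ** 2) + np.sum(b * a))
beta, alpha, gamma = recursive_bfgs(f, [1.0], [0.5], [1.0])
```

Swapping which block is fixed in the final polish would yield the Translation-based variant instead.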
4. Empirical analysis
This section is devoted to comprehensive empirical analyses of: (1) the performance of
HELPME’s Coarse and Fine stages in estimating mixed-profiles for a basic MDCEV formulation, and
(2) comparing mixed- and conventional-profiles from various perspectives, including different in-sample
accuracy metrics, mean percentage of out-of-sample prediction errors, biasedness of out-of-sample
predictions, and overall satiation curves. All two-sample comparisons are conducted using
Welch’s t-test (Welch 1947). A quick introduction of the data is presented below, followed by a
detailed discussion on the empirical analyses.
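Each of those two-sample comparisons reduces to a single SciPy call; the two error samples below are synthetic stand-ins, not the paper's results:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# two hypothetical samples of out-of-sample percentage errors
errors_a = rng.normal(loc=0.30, scale=0.10, size=200)
errors_b = rng.normal(loc=0.25, scale=0.20, size=200)

# Welch's t-test: a two-sample t-test without the equal-variance assumption
t_stat, p_value = stats.ttest_ind(errors_a, errors_b, equal_var=False)
```

`equal_var=False` is what distinguishes Welch's test from the standard pooled-variance t-test, which matters here because different profiles need not produce errors with equal variance.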
Due to concerns about the practical credibility of the results⁶, experiments in this section are
conducted using empirical data, the Atlanta Regional Travel Survey (ARC 2011). The dataset is fairly
recent, and is rich in terms of the diversity of explanatory variables and the number of observations. Also, it is
freely available, making it easy for interested readers to verify the results. A total of 25,810
individuals participated in the survey, 22,249 of whom reported at least one outdoor activity on the
survey day. We randomly selected fifty percent of the 22,249 observations for model estimation and
reserved the rest for out-of-sample prediction experiments.
Figure 4. Outdoor activity types and durations in the 50% estimation-sample
Four MDCEV models for outdoor activity types and durations are estimated. The activities are
aggregated into nine categories, namely Work, Education, Shop, Social, Recreation, Healthcare, Eat
Meal, Maintenance, and Other. Fig. 4 illustrates time expenditure patterns of the 50% estimation-
sample. All the models adopt the basic structure of MDCEV which assumes IID extreme value error
⁶ Synthetic data fail to truthfully represent real-life situations and thereby undermine practical findings such as
the magnitude of change in likelihood value, run time, and prediction power.
terms. The models are categorized, based on their satiation profiles, into two main classes. The first class
contains models with conventional α- and γ-profiles, which are routinely estimated using BFGS. The
second class includes Satiation-based and Translation-based mixed-profile models. Models of this
class are estimated five times using HELPME, with different random numbers provided to its Coarse
stage. The algorithm proposed by Pinjari & Bhat (2010)⁷ is used to predict consumption quantities
(c_k). To obtain comparable results, random inputs to that algorithm are set equal among all the
experiments for each run. All the models have identical explanatory variables, which are described in
Table 2. The estimation results are outlined in Table 3. The reader will note that, in Table 3, no t-statistic
has been reported for the γ parameters in the Satiation-based mixed profile, or for the α parameters in
the Translation-based mixed profile. The standard errors are not reported to restate the fact that a
mixed profile is not intended to reduce the dependency level of α and γ. These parameters are
interrelated in nature, regardless of the value at which one is fixed. Indeed, the γ parameters in a Satiation-based
mixed profile (the α parameters in a Translation-based mixed profile) should still be treated as fixed
values, similar to the case of conventional profiles.
⁷ The algorithm was originally proposed to predict within a γ-profile framework. Pinjari & Bhat (2010) also
suggest modifications using the bisection technique to make the algorithm compatible with cases where α and
γ are both subject to change. We use the modified version for prediction of a mixed-profile. Instead of the
bisection algorithm, though, Newton’s algorithm is used to eliminate the need for defining proper upper and
lower bounds for λ. Also, the λ from previous iterations is used as an initial value inputted to each iteration.
Table 2. Description of independent variables

Name | Definition | Average | Std. Dev.
HighEdu | 1: If the individual has a college degree/ 0: Otherwise | 0.397 | 0.489
Worker | 1: If the individual works/ 0: Otherwise | 0.554 | 0.497
Age | Age of the individual | 38.412 | 20.870
Student | 1: If the individual is a student/ 0: Otherwise | 0.047 | 0.212
Elderly | 1: If the individual is older than 60/ 0: Otherwise | 0.155 | 0.362
Male | 1: If the individual is male/ 0: Otherwise | 0.472 | 0.499
TeleWork | 1: If the individual works at home/ 0: Otherwise | 0.050 | 0.219
Homemaker | 1: If the individual is a homemaker/ 0: Otherwise | 0.038 | 0.191
White | 1: If the individual is White/ 0: Otherwise | 0.733 | 0.442
African | 1: If the individual is African-American/ 0: Otherwise | 0.191 | 0.393
Asian | 1: If the individual is Asian/ 0: Otherwise | 0.020 | 0.140
Nchild | Number of household children | 1.099 | 1.243
HighIncome | 1: If yearly income of the individual’s household is more than 60,000 dollars/ 0: Otherwise | 0.669 | 0.470
Table 3. Estimation results for mixed-profile MDCEV models

| Activity | Variable | α-profile Coeff. (t-value) | Satiation-based mixed-profile Coeff. (t-value) | γ-profile Coeff. (t-value) | Translation-based mixed-profile Coeff. (t-value) |
|---|---|---|---|---|---|
| Work | Constant | | | | |
| | HighEdu | -0.21 (-4.92) | -0.19*** (-4.77**) | -0.20 (-4.88) | -0.19** (-4.77**) |
| | Worker | 6.29 (17.70) | 6.30*** (17.74***) | 6.26 (17.62) | 6.30** (17.74**) |
| Education | Constant | 7.05 (19.76) | 7.09*** (19.86***) | 7.09 (19.88) | 7.09** (19.86**) |
| | Age | -0.09 (-31.91) | -0.08*** (-31.99***) | -0.09 (-32.15) | -0.08** (-32.00**) |
| | HighEdu | -1.69 (-10.27) | -1.69*** (-10.31***) | -1.70 (-10.31) | -1.69** (-10.31**) |
| | Student | 0.96 (12.02) | 0.96*** (12.02***) | 0.96 (12.02) | 0.96** (12.02**) |
| Shop | Constant | 5.67 (15.89) | 5.62*** (15.76***) | 5.61 (15.71) | 5.62*** (15.76***) |
| | Elderly | 0.62 (10.69) | 0.62** (10.69**) | 0.60 (10.45) | 0.62*** (10.69***) |
| | Male | -0.34 (-7.67) | -0.34*** (-7.90***) | -0.34 (-7.74) | -0.34*** (-7.90***) |
| | TeleWork | 2.27 (16.10) | 2.24** (15.94**) | 2.21 (15.71) | 2.24** (15.94**) |
| | Homemaker | 0.63 (7.22) | 0.65** (7.45**) | 0.61 (7.03) | 0.65*** (7.44***) |
| | HighIncome | -0.21 (-4.65) | -0.22*** (-4.91***) | -0.21 (-4.70) | -0.22*** (-4.91***) |
| | Nchild | -0.17 (-8.42) | -0.16*** (-7.92***) | -0.16 (-7.89) | -0.16*** (-7.92***) |
| Social | Constant | 4.79 (13.46) | 4.76*** (13.39***) | 4.76 (13.40) | 4.76*** (13.39***) |
| | Asian | -0.50 (-1.98) | -0.49*** (-1.97***) | -0.49 (-1.96) | -0.49*** (-1.97***) |
| | HighEdu | -0.36 (-5.72) | -0.35*** (-5.67***) | -0.35 (-5.68) | -0.35*** (-5.67***) |
| | Worker | -0.20 (-3.23) | -0.19*** (-3.06***) | -0.20 (-3.13) | -0.19** (-3.06**) |
| | TeleWork | 2.17 (13.28) | 2.17** (13.26**) | 2.13 (13.05) | 2.17** (13.26**) |
| Recreation | Constant | 4.66 (12.80) | 4.64*** (12.76***) | 4.63 (12.74) | 4.64*** (12.76***) |
| | African | -0.36 (-4.18) | -0.36** (-4.16**) | -0.36 (-4.14) | -0.36*** (-4.15***) |
| | Age | -0.01 (-5.25) | -0.01** (-5.18**) | -0.01 (-5.15) | -0.01*** (-5.18***) |
| | Worker | -0.35 (-5.08) | -0.34** (-4.91**) | -0.34 (-4.91) | -0.34*** (-4.91***) |
| | Student | -0.42 (-2.71) | -0.41** (-2.64**) | -0.41 (-2.64) | -0.41*** (-2.64***) |
| | TeleWork | 2.37 (14.61) | 2.38** (14.64**) | 2.34 (14.41) | 2.38** (14.64**) |
| | HighIncome | 0.42 (5.91) | 0.41** (5.79**) | 0.41 (5.77) | 0.41*** (5.79***) |
| Healthcare | Constant | 4.19 (11.56) | 4.18*** (11.54***) | 4.18 (11.55) | 4.18*** (11.54***) |
| | White | -0.26 (-3.32) | -0.25*** (-3.26***) | -0.26 (-3.28) | -0.25*** (-3.26***) |
| | Elderly | 1.02 (12.42) | 1.01*** (12.30***) | 1.01 (12.34) | 1.01*** (12.30***) |
| | Male | -0.43 (-5.67) | -0.42*** (-5.60***) | -0.42 (-5.64) | -0.42*** (-5.60***) |
| | Worker | -0.40 (-5.13) | -0.39*** (-5.03***) | -0.40 (-5.10) | -0.39*** (-5.03***) |
| | TeleWork | 2.40 (12.74) | 2.41** (12.73**) | 2.37 (12.56) | 2.41** (12.73**) |
| Eat Meal | Constant | 4.25 (11.76) | 4.22*** (11.68***) | 4.22 (11.67) | 4.22*** (11.68***) |
| | White | 0.61 (8.70) | 0.60** (8.62**) | 0.60 (8.54) | 0.60*** (8.62***) |
| | Elderly | 0.18 (2.52) | 0.19** (2.62**) | 0.19 (2.55) | 0.19** (2.62**) |
| | TeleWork | 2.25 (15.04) | 2.26** (15.06**) | 2.21 (14.78) | 2.26** (15.06**) |
| | HighIncome | 0.28 (4.69) | 0.28*** (4.77***) | 0.28 (4.63) | 0.28*** (4.77***) |
| | Nchild | -0.23 (-9.19) | -0.22*** (-8.99***) | -0.23 (-8.90) | -0.22*** (-8.99***) |
| Maintenance | Constant | 4.60 (12.85) | 4.57*** (12.78***) | 4.57 (12.77) | 4.57*** (12.78***) |
| | Age | 0.02 (12.65) | 0.01*** (11.99***) | 0.01 (12.04) | 0.01*** (11.99***) |
| | Worker | 0.31 (6.57) | 0.33*** (6.96***) | 0.34 (7.16) | 0.33** (6.95**) |
| | TeleWork | 2.27 (16.52) | 2.28** (16.56**) | 2.28 (16.52) | 2.28** (16.56**) |
| Other | Constant | 5.48 (15.45) | 5.44*** (15.32***) | 5.44 (15.33) | 5.44*** (15.32***) |
| | African | 0.16 (3.08) | 0.15*** (2.97***) | 0.15 (3.03) | 0.15** (2.97**) |
| | Male | -0.21 (-5.12) | -0.20*** (-4.94***) | -0.21 (-5.00) | -0.20*** (-4.94***) |
| | TeleWork | 1.88 (13.03) | 1.90** (13.13**) | 1.86 (12.87) | 1.90** (13.13**) |

Note: Symbols ***, **, and *, respectively, mean that the corresponding CV is less than 1E-4, 1E-3, and 1E-1.
Table 3. (Continued) Estimation results for mixed-profile MDCEV models

| Parameter | Activity | α-profile | Satiation-based mixed-profile | γ-profile | Translation-based mixed-profile |
|---|---|---|---|---|---|
| Satiation parameters | Work | 0.98 (5.73) | -8.90 (-50.50) | 0.00 (–) | -11.85 (–) |
| | Education | 0.98 (6.26) | -9.32 (-63.08) | 0.00 (–) | -10.13 (–) |
| | Shop | 0.73 (25.30) | -2.97* (-141.43*) | 0.00 (–) | -2.99* (–) |
| | Social | 0.87 (20.83) | -0.04* (-1.11*) | 0.00 (–) | -0.04* (–) |
| | Recreation | 0.83 (20.25) | -4.75* (-138.85*) | 0.00 (–) | -4.83* (–) |
| | Healthcare | 0.87 (13.85) | -0.33* (-5.89*) | 0.00 (–) | -0.33* (–) |
| | Eat Meal | 0.75 (21.41) | -8.79* (-335.01*) | 0.00 (–) | -8.52* (–) |
| | Maintenance | 0.74 (26.99) | 0.57** (26.26***) | 0.00 (–) | 0.57** (–) |
| | Other | 0.66 (23.34) | 0.38*** (19.44***) | 0.00 (–) | 0.38** (–) |
| Translation parameters | Work | 1.00 (–) | 35687.56 (–) | 6529.32 (21161.24) | 46268.96 (268217.86) |
| | Education | 1.00 (–) | 23423.83 (–) | 2063.36 (14966.08) | 25516.58 (172887.15) |
| | Shop | 1.00 (–) | 179.01* (–) | 33.46 (1014.80) | 180.61* (7246.90*) |
| | Social | 1.00 (–) | 232.04* (–) | 211.07 (3955.86) | 232.03* (4287.10*) |
| | Recreation | 1.00 (–) | 805.23* (–) | 115.14 (2394.70) | 820.98* (21753.72*) |
| | Healthcare | 1.00 (–) | 186.14* (–) | 127.24 (1777.18) | 186.08* (2662.12*) |
| | Eat Meal | 1.00 (–) | 596.06* (–) | 45.68 (1131.14) | 582.78* (20554.35*) |
| | Maintenance | 1.00 (–) | 6.99** (–) | 30.75 (821.18) | 6.99** (151.81**) |
| | Other | 1.00 (–) | 6.79*** (–) | 14.74 (432.47) | 6.79*** (175.14***) |
| Scale parameter | | 1.00 (–) | 1.00 (–) | 1.00 (–) | 1.00 (–) |
| Log-likelihood value | | -93,613.35 | -87,882.19*** | -88,766.93 | -87,882.12*** |
| Bayesian Information Criterion (BIC) | | -187,729.79 | -176,351.32*** | -178,036.96 | -176,351.11*** |
| Akaike Information Criterion (AIC) | | -187,334.69 | -175,890.37*** | -177,641.86 | -175,890.23*** |
| CPU time (minutes) | | 30.51 | 70.29* | 45.71 | 69.76* |

Coarse stage parameters: P=5, LSIter=5, =0.9, =0.9, =1, =0.01
Fine stage termination parameters: log-likelihood tolerance: 1E-6; maximum parameter tolerance: 1E-6
Note: Symbols ***, **, and *, respectively, mean that the corresponding CV is less than 1E-4, 1E-3, and 1E-1. Entries shown as (–) are fixed rather than estimated.
4.1. Performance of HELPME in MDCEV estimation
As mentioned above, providing rough solutions and being considerably sensitive to random inputs are among the most important disadvantages of meta-heuristics compared to gradient-based approaches. This section discusses the performance of HELPME in estimating mixed-profiles for a basic MDCEV. The performance is explored after the completion of each stage.
After termination of the Coarse stage, we explored three aspects, namely optimality of the final log-likelihood value, sensitivity of the final log-likelihood value to random inputs, and the prospect of improving the best particle. The first two aspects are quantified by, respectively, the average and coefficient of variation (CV) of the final log-likelihoods across different runs, and the third is measured by calculating the portion of generations with an improved best particle. Each of these metrics is calculated for the original EML and three of its variants to capture the marginal effects of the proposed customizations. The results are summarized in Table 4. In this table, for each variant, one row results from the recommended stopping criteria, while the other is obtained by letting the algorithm run for a longer or shorter time than usual. Per the results, applying the new random local search causes a 59% increase in the final log-likelihood (i.e., from -617,885 to -251,346) at an 87% confidence level, and an over-51% decrease in the final log-likelihood's CV (i.e., from 54% to 26%). Applying the new initialization phase further results in a 64% growth of the final log-likelihood (i.e., from -251,346 to -89,169) at a 94% confidence level, and a 99% decline in the CV of the final log-likelihood (i.e., from 26% to 0.2%). The proposed force calculation method also causes a 0.17% improvement of the final log-likelihood (i.e., from -89,169 to -89,021), which is significant at an 85% level of confidence. Lastly, the best particle is about 22% and 9% more likely to be improved when the new initialization phase and force calculation method are used, respectively.
In addition, Fig. 5 depicts log-likelihood versus the heuristic's generation number in each of the five runs. As can be seen, the 5th run provides the best EML estimates in terms of log-likelihood value. In this run, the log-likelihood of EML's best initial particle is about -900,000, which is almost ten times the starting point of HELPME (i.e., the Key-particle's log-likelihood). This figure also depicts the considerable improvement in stability of the final log-likelihood when HELPME's Coarse stage replaces the original EML. Final log-likelihoods of the original EML's particles range roughly between -300,000 and -1,100,000, while those of HELPME's Coarse stage range between -89,000 and -89,500.
Table 4. Marginal effects of each modification on EML

| Initialization and SM | Random Local Search | Force Calculation | Number of Generations | Average CPU Time (minutes) | Log-likelihood: Average | Log-likelihood: CV (%) | Improved Best Particle: Portion of Generations | Improved Best Particle: CV (%) |
|---|---|---|---|---|---|---|---|---|
| – | – | – | 120 | 2.57 | -1,393,434 | 25 | 1.17% | 167 |
| – | – | – | 1600 | 34.62 | -617,885 | 54 | 10.26% | 27 |
| – | ✓ | – | 120 | 2.83 | -591,849 | 41 | 4.67% | 79 |
| – | ✓ | – | 650 | 17.69 | -251,346 | 26 | 8.87% | 58 |
| ✓ | ✓ | – | 105 | 5.12 | -89,169 | 0.2 | 32.67% | 14 |
| ✓ | ✓ | – | 664 | 19.87 | -88,697 | 0.04 | 17.10% | 16 |
| ✓ | ✓ | ✓ | 106 | 5.23 | -89,021 | 0.2 | 41.82% | 21 |
| ✓ | ✓ | ✓ | 648 | 19.95 | -88,708 | 0.08 | 17.31% | 19 |
Figure 5. Maximization trend of the unconstrained model's log-likelihood: (a) original EML; (b) Coarse stage of HELPME
After termination of the Fine stage, we explored two aspects: stability of the model's estimates and stability of the final likelihood value. Based on the model estimation results in Table 3, the CVs of all of HELPME's outputs are considerably low. For mixed-profile models, the CVs of the log-likelihood at convergence are less than 1E-4. Besides, the CVs of all parameters of the baseline marginal utility are less than 1E-3. Similar results are achieved for the other outputs (see t-values and CPU times in Table 3), revealing the negligible sensitivity of HELPME's accuracy to random numbers.
4.2. Distinctions between the mixed-profile, α-profile, and γ-profile
Estimating a mixed-profile model costs extra computational time (see Table 3). However, the mixed-profile outperforms the conventional profiles on the accuracy side, as evidenced in this section. A detailed discussion is presented on the results of various model selection metrics, namely: (1) BIC, AIC, and the likelihood ratio test as in-sample metrics, (2) mean absolute percentage of prediction errors and biasedness of predictions as out-of-sample measures, and (3) overall satiation patterns.
BIC and AIC are two widely used model selection metrics based solely on in-sample information. Under the sign convention of Table 3, models with higher BIC and/or AIC values have better in-sample fits. Table 3 outlines the values of these metrics, setting the number of parameters of the MDCEV models with the conventional and mixed-profiles to, respectively, 54 and 63. Per both the BIC and AIC results in Table 3, the γ-profile model has a 5%-better fit compared to the α-profile, and the Translation-based mixed-profile is about 1% better-fitted compared to the γ-profile. Moreover, a likelihood ratio test is performed to capture the statistical significance of the gap between the in-sample fits of the γ-profile MDCEV and the Translation-based mixed-profile MDCEV. The likelihood ratio statistic is calculated as 1769.62, which is statistically significant at the 99.99% level. As none of these in-sample tests detect a significant difference between the two mixed-profiles, we refer to the 'Translation-based mixed-profile' simply as the 'mixed-profile' in the remainder of the paper.
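These in-sample comparisons can be checked directly from the figures reported in Table 3. The sketch below assumes the table's sign convention (criteria computed from 2·LL, so higher is better) and reproduces the reported likelihood ratio statistic and AIC values; the sample size used for BIC is not restated in this section, so BIC is shown only in formula form.

```python
# Reproducing the in-sample comparison from Table 3's log-likelihoods.
import math

def aic(ll, k):
    return 2.0 * ll - 2.0 * k          # higher is better under this convention

def bic(ll, k, n):
    return 2.0 * ll - k * math.log(n)  # n = estimation-sample size

ll_gamma, k_gamma = -88766.93, 54      # gamma-profile (Table 3)
ll_mixed, k_mixed = -87882.12, 63      # Translation-based mixed-profile (Table 3)

# Likelihood ratio statistic for mixed- vs gamma-profile (9 extra parameters);
# far above the chi-square critical value of ~21.67 (df = 9, 99% level).
lr = 2.0 * (ll_mixed - ll_gamma)       # = 1769.62, as reported in the text

aic_mixed = aic(ll_mixed, k_mixed)     # ~ -175,890.24 (Table 3: -175,890.23)
aic_gamma = aic(ll_gamma, k_gamma)     # ~ -177,641.86, matching Table 3
```

The tiny 0.01 discrepancy in the mixed-profile AIC comes from rounding the log-likelihood to two decimals before recomputing.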
The out-of-sample prediction power of the γ-profile and mixed-profile models is investigated using the reserved 50% prediction-sample. Thirty sub-samples are randomly drawn with replacement from the prediction-sample, each containing 20 percent of the prediction-sample. Then, observed activity durations are recorded for each sub-sample. After that, the two models are applied to each sub-sample, producing a total of 30 sets of activity duration predictions for each model. All predictions are performed using 1,000 sets of simulated error terms (ε) inputted to the prediction algorithm (see Pinjari & Bhat 2010). Moreover, stochastic budget determination is avoided⁸ by using observed total activity durations, for the sake of simplicity. The prediction records are compared using two different out-of-sample metrics, as discussed below.
The first out-of-sample metric is MAPE (Mean Absolute Percentage Error), which is argued to be a reliable aggregate representation of the individual-level gaps between predictions and observations (see Appendix 1). The calculated MAPE values are outlined in Table 5. As shown, by employing a mixed-profile, all activities experience considerable and statistically significant reductions in out-of-sample prediction errors. The lowest and highest levels of statistical confidence are, respectively, 79% and 93%. Regarding the percent change in MAPE, the results indicate that the least-improved predictions are associated with the activity type Other (68%), and the most improvement is observed for Education (97%). Regarding the absolute change, though, the lowest and highest error reductions are observed, respectively, for Healthcare (130 percentage points) and Work (591 percentage points).
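The evaluation loop just described (30 sub-samples drawn with replacement, MAPE computed on each) can be sketched as below. The observed and predicted durations here are toy numbers standing in for the model's output, not the paper's data.

```python
# Sketch of the out-of-sample MAPE evaluation over bootstrap sub-samples.
import random

def mape(observed, predicted):
    """Mean absolute percentage error over paired observations (zeros skipped)."""
    n = sum(1 for o in observed if o != 0)
    return 100.0 * sum(abs(o - p) / abs(o)
                       for o, p in zip(observed, predicted) if o != 0) / n

def subsample_mapes(observed, predicted, n_subs=30, frac=0.2, seed=0):
    """Draw n_subs sub-samples with replacement, each frac of the sample."""
    rng = random.Random(seed)
    size = max(1, int(frac * len(observed)))
    out = []
    for _ in range(n_subs):
        idx = [rng.randrange(len(observed)) for _ in range(size)]  # with replacement
        out.append(mape([observed[i] for i in idx],
                        [predicted[i] for i in idx]))
    return out

obs = [60.0, 120.0, 30.0, 240.0, 90.0]    # toy observed durations (minutes)
pred = [66.0, 108.0, 33.0, 216.0, 99.0]   # toy predictions, each off by 10%
mapes = subsample_mapes(obs, pred)
```

Because every toy prediction is off by exactly 10%, every sub-sample MAPE equals 10% here; with real predictions, the spread of `mapes` gives the standard errors reported in Table 5.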
8 Previous studies have employed different approaches. Bhat et al. (2013), for instance, adopted the Fractional Split model proposed in Sivakumar & Bhat (2002).
Table 5. Comparing MAPE of out-of-sample predictions†

| Activity | γ-profile MAPE: Average (%) | γ-profile MAPE: Std. Err. (%) | Mixed-profile MAPE: Average (%) | Mixed-profile MAPE: Std. Err. (%) | p-Value (%) | Absolute Reduction (%) | Percent Reduction (%) |
|---|---|---|---|---|---|---|---|
| Educating | 373.39 | 1358.52 | 10.40 | 1.83 | 92.4 | 362.99*** | 97.21 |
| Working | 612.80 | 2213.13 | 21.12 | 6.24 | 92.4 | 591.67*** | 96.55 |
| Social | 395.45 | 1423.67 | 15.10 | 1.94 | 92.4 | 380.35*** | 96.18 |
| Health | 136.95 | 486.90 | 6.76 | 0.55 | 92.4 | 130.19*** | 95.06 |
| Recreation | 227.81 | 810.46 | 11.50 | 1.09 | 92.3 | 216.30*** | 94.95 |
| Eat Meal | 168.76 | 576.91 | 13.54 | 0.82 | 92.5 | 155.23*** | 91.98 |
| Shopping | 243.22 | 784.32 | 25.66 | 1.68 | 93.1 | 217.56*** | 89.45 |
| Maintenance | 458.40 | 1442.39 | 103.64 | 7.91 | 90.6 | 354.76*** | 77.39 |
| Other | 327.52 | 948.50 | 103.61 | 7.66 | 79.4 | 223.91* | 68.37 |

Note: † Over 30 sub-samples randomly drawn with replacement from the reserved sample. Symbols *** and *, respectively, mean significant at the 10% and 25% levels.
The second out-of-sample metric is the bar chart shown in Fig. 6, depicting the mean values of observed and predicted activity durations. Although Fig. 6 is similar to Fig. 4(b), it differs from Fig. 4(b) in two ways. First, unlike Fig. 4(b), zero-consumption observations are not excluded from Fig. 6. Second, Fig. 4(b) is generated using the 50% estimation-sample, while Fig. 6 is generated by averaging activity durations over the 30 sub-samples drawn from the 50% prediction-sample. As shown, the mixed-profile imposes notably less bias on the durations of Shop (60% reduction), Recreation (9% reduction), Eat Meal (67% reduction), Maintenance (12% reduction), and Other (26% reduction) compared to the γ-profile. The mixed-profile is roughly as biased as the γ-profile for the rest of the activities (i.e., Work, Education, Social, and Healthcare). While the γ-profile overestimates the durations of Shop, Recreation, and Eat Meal, it underestimates Maintenance and Other durations. Regarding bias magnitudes, the three largest biases are associated with the γ-profile's predictions for Eat Meal, Shop, and Healthcare, and the lowest bias is associated with the mixed-profile's predictions for Education.
Figure 6. Average activity durations suggested by observations and each satiation profile: (a) mandatory activities; (b) non-mandatory activities
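The bias comparison behind Fig. 6 can be sketched as a simple computation: aggregate bias is the gap between mean predicted and mean observed duration, and the reduction is the relative shrinkage of that gap when moving from the γ-profile to the mixed-profile. The numbers below are illustrative, not the paper's.

```python
# Hedged sketch of the aggregate-bias comparison in Fig. 6.

def bias(observed_mean, predicted_mean):
    """Aggregate bias: > 0 means overestimation, < 0 means underestimation."""
    return predicted_mean - observed_mean

def bias_reduction(obs_mean, pred_gamma, pred_mixed):
    """Percent shrinkage in |bias| when the mixed-profile replaces the gamma-profile."""
    b_g = bias(obs_mean, pred_gamma)
    b_m = bias(obs_mean, pred_mixed)
    return 100.0 * (abs(b_g) - abs(b_m)) / abs(b_g)

# Toy example: gamma-profile overestimates by 20 minutes, mixed by 8 minutes.
reduction = bias_reduction(60.0, 80.0, 68.0)
```
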
Finally, overall satiation patterns are compared in Fig. 7, which consists of two components. The first component depicts the different estimated satiation profiles. That is, the baseline marginal utility is set to exp(1) for all goods, and the utility functions estimated by either the conventional or mixed-profiles are plotted against the activity durations. The second component shows the observed time-allocation frequencies (pooling the results of all 30 sub-samplings). Such frequency distributions provide loose, but useful, information on the upper bound of the point after which full satiation occurs. Per the observed distribution of shop durations, an average individual has at most a 1% likelihood of wishing to extend his or her shopping duration beyond about 4 hours and 40 minutes per day. In other words, the utility profile of Shop is expected to be fully satiated, at most, after 5 hours. The mixed-profile follows this trend well, estimating marginal utilities of 0.091 at 4 hours, 0.035 at 6 hours, and 0.000 at 24 hours. However, neither of the conventional profiles could follow such a pattern. For instance, the γ-profile (which is better fitted than the α-profile) shows marginal utilities of 0.331 at 4 hours, 0.230 at 6 hours, and 0.061 at 24 hours. Recreation and Eat Meal show similar patterns to Shop. For these activities, both the observed distributions and the mixed-profile plots indicate highly satiated consumption, whereas the α- and γ-profiles estimate low- or medium-satiated consumption. In the case of Work, the conventional profiles suggest almost linear satiation for any activity duration. In the case of Education, Social, and Healthcare, however, the conventional profiles suggest similar satiation patterns to the mixed-profile. The upper bound of the full-satiation point for Maintenance and Other is not as clear as for the other activities in Fig. 7. Also, these choices encompass a more diverse range of activities, making it difficult to form a general expectation of actual patterns. Thus, it is not clear from Fig. 7 which profile suggests a more realistic expectation.
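The marginal utilities quoted above follow from Bhat's (2008) sub-utility form: with the baseline marginal utility ψ fixed at exp(1), the marginal utility at duration t is ψ·(1 + t/γ)^(α−1). The sketch below evaluates this for Shop using the parameters in Table 3 (durations assumed to be in minutes); small differences from the quoted figures reflect the two-decimal rounding of the reported parameters.

```python
# Marginal utility of additional duration under the MDCEV sub-utility form.
import math

def marginal_utility(t, alpha, gamma, psi=math.e):
    """psi * (1 + t/gamma)**(alpha - 1); psi = exp(1) as set for Fig. 7."""
    return psi * (1.0 + t / gamma) ** (alpha - 1.0)

# Shop, Translation-based mixed-profile: alpha = -2.99, gamma = 180.61 (Table 3)
mu4  = marginal_utility(4 * 60, -2.99, 180.61)    # ~0.093 (text quotes 0.091)
mu24 = marginal_utility(24 * 60, -2.99, 180.61)   # ~0.000

# Shop, gamma-profile: alpha fixed at 0, gamma = 33.46 (Table 3)
mu4_gamma = marginal_utility(4 * 60, 0.0, 33.46)  # ~0.333 (text quotes 0.331)
```

Evaluating both profiles side by side reproduces the qualitative contrast in Fig. 7: the mixed-profile's marginal utility is essentially exhausted beyond a few hours of shopping, while the γ-profile's decays far more slowly.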
Figure 7. Comparison of conventional- and mixed-profiles for each activity type: (a) Work; (b) Education; (c) Shop; (d) Social; (e) Recreation; (f) Healthcare; (g) Eat Meal; (h) Maintenance; (i) Other
4.3. Discussion of baseline marginal utilities
As evidenced, the mixed- and conventional-profiles estimated in this study suggest distinct marginal utilities and different prediction accuracies, but similar β parameters. This is not surprising since, by definition, various combinations of marginal utilities may result in similar marginal rates of substitution. Also, many explanatory variables are introduced in the baseline marginal utility of each good, and small variations in each of the β parameters may collectively result in a considerable change in the baseline marginal utility. Such small variations in β are also observed in the models developed in Bhat (2008, p. 52). A quick discussion of the estimated β coefficients follows.
Age, gender, ethnicity, and household income are among the most influential explanatory variables introduced in the baseline marginal utilities. The results indicate that, as people age, they tend to be less willing to make high time-allocations to outdoor recreation, which is consistent with previous studies (Bhat 2005). Age is also found to have a positive correlation with time expenditures on maintenance activities. In addition, senior citizens (i.e., aged 60+) are found, compared to other individuals, to be more inclined toward participating in outdoor shopping activities. Srinivasan et al. (2006) and Mattson (2012) have come to similar results. Senior citizens are also found more likely to perform healthcare trips, which is understandable given their age conditions. In view of gender differences, the results suggest a negative correlation between being male and willingness to allocate time to shopping. In line with Bhat (2005), moreover, the results of the current study show that African-American citizens are less likely to spend time on outdoor recreation, while Asians are less inclined toward social activities. The results also indicate that individuals from high-income households tend to participate more in outdoor recreation and Eat Meal activities.
5. Summary and conclusions
Theoretically speaking, empirical identification problems are not necessarily unresolvable. The current paper is an effort to provide methodological avenues and empirical evidence to answer two fundamental questions on the empirical identification problem in a basic MDCEV formulation: (1) how one can estimate both α and γ parameters simultaneously in real-life applications, and (2) what the empirical gains of estimating both α and γ (i.e., a 'mixed-profile') could be. A hybrid optimization routine (HELPME) is proposed to estimate mixed-profiles and answer these questions by comparing mixed and conventional profiles from different aspects.
Essentially speaking, like the conventional α- and γ-profiles, a mixed-profile restricts either the α or the γ parameters to fixed values. Yet, it improves on the accuracy of the conventional profiles in two ways. First, it does not impose uniform restrictions on all goods' satiation curves (i.e., α_k = 0 for all k, or γ_k = 1 for all k). As mentioned in Bhat (2008), for some goods the α-profile is found more accurate than the γ-profile, while for others the γ-profile outperforms the α-profile. Imposing uniform restrictions on all alternative goods of a discrete choice problem may consequently force the satiation curves of some of them to follow a functional form that cannot be well-matched to the corresponding accurate profile. More importantly, the mixed-profile does not impose predefined restrictions (i.e., α_k = 0 or γ_k = 1) that are not guaranteed to be accurate enough in all discrete choice problems, as implied by the definition of empirical unidentifiability.
The models in this paper are estimated using a 50% random sub-sample drawn from over 22,000 observations in the Atlanta Regional Travel Survey (2011) data, and empirical tests are conducted using the remaining observations. The tests are designed to investigate: (1) the performance of HELPME in estimating the mixed-profile model, and (2) various in-sample and out-of-sample accuracy measures of the model. The accuracy metrics include BIC, AIC, the likelihood ratio, MAPE (Mean Absolute Percentage Error), overall biasedness, and overall satiation plots.
On the HELPME performance side, efficiency is explored after completion of both the Coarse and Fine stages. Per the results, the Coarse stage of HELPME finds better and more stable solutions compared to the original EML. To be specific, the final log-likelihood value is improved by 85% and its coefficient of variation (CV) is decreased by 54%. Furthermore, the CVs of both the final likelihood and the model parameters are considerably low in the Fine stage. On the side of the mixed- and conventional-profiles' accuracy, the following results are achieved:
- BIC, AIC, and likelihood ratio tests indicate that the mixed-profile improves the model's in-sample fit by about 1% with a 99.99% level of confidence.
- MAPE results show that the model's out-of-sample prediction errors are drastically reduced for all activities. The reductions range from 68% (for Other) to 97% (for Education) and are significant at a 90% level of confidence, except for Other, which is significant at a 75% level.
- Regarding out-of-sample mean predictions, the mixed-profile is shown to impose considerably less bias on the durations of Shop, Recreation, Eat Meal, Maintenance, and Other. Among these activities, the lowest and highest bias reductions correspond to Recreation (9%) and Eat Meal (67%), respectively.
- Although observations and the mixed-profiles consistently show highly satiated consumption patterns for Shop, Recreation, and Eat Meal, the conventional profiles suggest low or medium levels of satiation. In addition, the conventional profiles fit nearly linear profiles for Work, while the mixed-profile introduces some level of satiation.
Due to space and scope restrictions, not all aspects of the problem could be explored in one paper, calling for future studies. First, we have only focused on the α and γ profiles of a basic MDCEV formulation (which adopts the IID assumption). In fact, the extent of the gap between mixed- and conventional-profiles may be reduced by introducing scale heterogeneities and/or non-additively-separable utility functions. Future studies may take such effects into account. Second, future studies may take advantage of the controlled setting provided by synthesized data and run various scenario-based analyses. Using empirical data in this paper helps us make sure that the analyses are credible from the viewpoints of estimation burden and predictive ability. Third, given the meta-heuristic, gradient-free Coarse stage of HELPME, the proposed paradigm seems to have the potential to reduce the estimation burden of other econometric models, especially those with hard-to-evaluate gradients and those that can easily fall into the local-optima trap. Future studies may also explore HELPME's performance in estimating such models. Finally, although we used various in-sample and out-of-sample accuracy measures to explore the extent of the predictive gap between conventional and mixed-profiles, how this gap could affect policies and decisions in real-life applications is still an open question, which may be addressed in complementary papers.
Acknowledgements
We appreciate Professor Chandra Bhat for sharing the source code of the MDCEV model and its documentation. We would also like to thank three anonymous reviewers whose comments have helped greatly to strengthen the arguments within the paper, as well as the directions for future studies.
References
Atlanta Regional Commission. (2011). Atlanta Regional Travel Survey Final Report.
Armijo, Larry. “Minimization of functions having Lipschitz continuous first partial derivatives.”
Pacific Journal of mathematics 16, no. 1 (1966): 1-3.
Armstrong, J. Scott, and Fred Collopy. "Error measures for generalizing about forecasting methods:
Empirical comparisons." International journal of forecasting 8, no. 1 (1992): 69-80.
Armstrong, J. Scott. "Evaluating forecasting methods." In Principles of forecasting, pp. 443-472.
Springer US, 2001.
Bhat, Chandra R. “A multiple discrete–continuous extreme value model: formulation and application to discretionary time-use decisions.” Transportation Research Part B: Methodological 39, no. 8 (2005): 679-707.
Bhat, Chandra R., and Sivaramakrishnan Srinivasan. “A multidimensional mixed ordered-response model for analyzing weekend activity participation.” Transportation Research Part B: Methodological 39, no. 3 (2005): 255-278.
Bhat, Chandra R., Sivaramakrishnan Srinivasan, and Sudeshna Sen. “A joint model for the perfect and imperfect substitute goods case: application to activity time-use decisions.” Transportation Research Part B: Methodological 40, no. 10 (2006): 827-850.
Bhat, Chandra R. “The multiple discrete-continuous extreme value (MDCEV) model: role of utility function parameters, identification considerations, and model extensions.” Transportation Research Part B: Methodological 42, no. 3 (2008): 274-303.
Bhat, Chandra R., Konstadinos G. Goulias, Ram M. Pendyala, Rajesh Paleti, Raghuprasad Sidharthan,
Laura Schmitt, and Hsi-Hwa Hu. “A household-level activity pattern generation model with an
application for Southern California.”Transportation 40, no. 5 (2013): 1063-1086.
Birbil, Ş. İlker, and Shu-Cherng Fang. “An electromagnetism-like mechanism for global optimization.” Journal of Global Optimization 25, no. 3 (2003): 263-282.
Edwards, Yancy D., and Greg M. Allenby. “Multivariate analysis of multiple response data.” Journal of Marketing Research 40, no. 3 (2003): 321-334.
El-Gallad, A. I., A. A. Sallam, and M. E. El-Hawary. “Swarming of intelligent particles for solving the nonlinear constrained optimization problem.” International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications 9, no. 3 (2001): 155-164.
Eluru, Naveen, Abdul R. Pinjari, Ram M. Pendyala, and Chandra R. Bhat. “A Unified Model System
of Activity Type Choice, Activity Duration, Activity Timing, Mode Choice, and Destination
Choice.” Working Paper, The University of Texas at Austin, Texas, 2009.
Frank, Marguerite, and Philip Wolfe. “An algorithm for quadratic programming.” Naval Research Logistics Quarterly 3, no. 1-2 (1956): 95-110.
Greene, William H. Econometric analysis. Pearson Education India, 2003.
Goodwin, Paul, and Richard Lawton. "On the asymmetry of the symmetric MAPE." International
journal of forecasting 15, no. 4 (1999): 405-408.
Hendel, Igal. “Estimating multiple-discrete choice models: An application to computerization returns.” Review of Economic Studies (1999): 423-446.
Hyndman, Rob J., and Anne B. Koehler. "Another look at measures of forecast accuracy."
International journal of forecasting 22, no. 4 (2006): 679-688.
Kenny, D. A. (1979), “Correlation and causality,” New York: Wiley.
Kim, Jaehwan, Greg M. Allenby, and Peter E. Rossi. “Modeling consumer demand for
variety.” Marketing Science 21, no. 3 (2002): 229-250.
Koehler, A. B. "The asymmetry of the sAPE measure and other comments on the M3-competition."
International Journal of Forecasting 17, no. 4 (2001): 570-574.
LaMondia, Jeffrey, Chandra R. Bhat, and David A. Hensher. "An annual time use model for domestic
vacation travel." Journal of Choice Modelling 1, no. 1 (2008): 70-97.
Liao, Ching-Jong, Yu-Wei Kuo, Tsui-Ping Chung, and Stephen C. Shih. “Integrating production and transportation scheduling in a two-stage supply chain.” European Journal of Industrial Engineering 9, no. 3 (2015): 327-343.
Liu, Yu-Hsin, and Hani S. Mahmassani. “Global maximum likelihood estimation procedure for
multinomial probit (MNP) model parameters.” Transportation Research Part B:
Methodological 34, no. 5 (2000): 419-449.
Liang, Gao, Wang Xiaojuan, Wei, and Chen Yazhou. “A modified algorithm for electromagnetism-
like mechanism.”Journal of Huazhong University of Science and Technology (Nature Science
Edition) 11 (2006): 001.
Makridakis, Spyros, and Michele Hibon. "The M3-Competition: results, conclusions and implications."
International journal of forecasting 16, no. 4 (2000): 451-476.
Manchanda, Puneet, Asim Ansari, and Sunil Gupta. “The ‘shopping basket’: A model for multicategory purchase incidence decisions.” Marketing Science 18, no. 2 (1999): 95-114.
Mattson, Jeremy Wade. Travel behavior and mobility of transportation-disadvantaged populations:
Evidence from the National Household Travel Survey. No. DP-258. Upper Great Plains
Transportation Institute, 2012.
Pinjari, Abdul Rawoof, and Chandra Bhat. “A Multiple Discrete–Continuous Nested Extreme Value
(MDCNEV) model: formulation and application to non-worker activity time-use and timing
behavior on weekdays.” Transportation Research Part B: Methodological 44, no. 4 (2010): 562-
583.
Pinjari, Abdul Rawoof, and Chandra Bhat. “An efficient forecasting procedure for Kuhn-Tucker consumer demand model systems.” Technical paper. Department of Civil & Environmental Engineering, University of South Florida (2010).
Sivakumar, Aruna, and Chandra Bhat. “Fractional split-distribution model for statewide commodity-
flow analysis.” Transportation Research Record: Journal of the Transportation Research Board
1790 (2002): 80-88.
Sobhani, Anae, Naveen Eluru, and Ahmadreza Faghih-Imani. “A latent segmentation based multiple
discrete continuous extreme value model.” Transportation Research Part B: Methodological 58
(2013): 154-169.
Sobhani, Anae, Naveen Eluru, and Abdul R. Pinjari. “Evolution of Adults’ Weekday Time Use
Patterns from 1992 to 2010: A Canadian Perspective.” In 93rd Annual Meeting of the
Transportation Research Board (TRB), Washington, DC. 2014.
Srinivasan, Nanda, Nancy McGuckin, and Elaine Murakami. "Working retirement: Travel trends of the
aging workforce." Transportation Research Record: Journal of the Transportation Research
Board 1985 (2006): 61-70.
Train, Kenneth E. Discrete choice methods with simulation. Cambridge university press, 2009.
Vasquez Lavin, Felipe, and W. Michael Hanemann. “Functional forms in discrete/continuous choice
models with general corner solution.” Department of Agricultural & Resource Economics,
UCB (2008).
Van Nostrand, Caleb, Vijayaraghavan Sivaraman, and Abdul Rawoof Pinjari. “Analysis of long-distance vacation travel demand in the United States: a multiple discrete–continuous choice framework.” Transportation 40, no. 1 (2013): 151-171.
Vij, Akshay, and Joan L. Walker. “Hybrid choice models: The identification problem.” In Handbook of Choice Modelling. Edward Elgar Publishing, 2014.
Wales, Terence J., and Alan Donald Woodland. “Estimation of consumer demand systems with
binding non-negativity constraints.”Journal of Econometrics 21, no. 3 (1983): 263-285.
Walker, Joan L., Moshe Ben-Akiva, and Denis Bolduc. "Identification of parameters in normal error
component logit-mixture (NECLM) models." Journal of Applied Econometrics 22, no. 6
(2007): 1095-1125.
Welch, Bernard L. “The generalization of ‘Student's’ problem when several different population
variances are involved.” Biometrika 34 (1947): 28-35.
Yurtkuran, Alkın, and Erdal Emel. “A new hybrid electromagnetism-like algorithm for capacitated
vehicle routing problems.” Expert Systems with Applications 37, no. 4 (2010): 3427-3433.
Yu, Biying, Junyi Zhang, and Akimasa Fujiwara. "Representing in-home and out-of-home energy
consumption behavior in Beijing." Energy Policy 39, no. 7 (2011): 4168-4177.
Zhang, Chunjiang, Xinyu Li, Liang Gao, and Qing Wu. “An improved electromagnetism-like
mechanism algorithm for constrained optimization.” Expert Systems with Applications 40, no.
14 (2013): 5621-5634.
Appendix 1
Aggregate error metrics can generally be grouped into three categories: benchmark-based relative
measures, scale-dependent measures, and percentage-based measures. As the name suggests, measures
of the first group use errors obtained from a benchmark forecasting method (e.g., random-walk
naïve forecasts) to scale the errors of the method of interest (Hyndman & Koehler 2006). Since no
benchmark method could be found for generating activity durations, such metrics could not be used
in our study.
Among the most frequently used metrics of the second group are Root Mean Squared Error
(RMSE) and Mean Absolute Error (MAE). These metrics are calculated using Eq. 23 and Eq. 24,
where O_n and P_n denote the nth observed and predicted values, and N is the total number of
observations.
MAE and RMSE, like any other scale-dependent measure, have two prominent disadvantages. First,
they have the same scale as the data (Hyndman & Koehler 2006); it would therefore be meaningless
to use them for comparing prediction results of the γ-profile model against those of the
mixed-profile model, since the two prediction sets have different scales. Second, neither MAE nor
RMSE distinguishes the percentage magnitude of an error. For instance, the gap between 5 and 10 (i.e., a 100% error) is
treated the same as the gap between 50 and 55 (i.e., a 10% error). In the case of RMSE, it has also been
extensively argued that the metric is highly sensitive to outliers and, consequently, may not be a valid
measure of accuracy (Armstrong & Collopy 1992; Armstrong 2001; Hyndman & Koehler 2006).
Measures of the third group have neither of the deficiencies mentioned for scale-dependent
metrics. A popular metric of this category is Mean Absolute Percentage Error (MAPE), which is
calculated using Eq. 25. The primary drawbacks of MAPE are (Hyndman & Koehler 2006): (1) it loses accuracy
when some observations are close or equal to zero, (2) it places more weight on positive
errors than on negative errors in some situations, and (3) it does not yield the same error
measure when forecasts and observations are interchanged; consider, for example, the cases of
(O_n = a, P_n = b) and (O_n = b, P_n = a). The so-called Symmetric MAPE (sMAPE), as
shown in Eq. 26, was introduced mainly to resolve these drawbacks (Hyndman & Koehler 2006). As Hyndman
& Koehler (2006) argued, though, it is not guaranteed that sMAPE is significantly less sensitive to
the existence of very small observations. Furthermore, as Goodwin & Lawton (1999) show, sMAPE imposes
different penalties on positive and negative errors in some situations while MAPE does not.
In particular, the authors state: “it [sMAPE] in fact creates a new problem of asymmetry which is more
likely to be of practical concern than the problem resulting from the interchange. Indeed, the
conventional APE [that is, MAPE before aggregation] does not treat single errors above the actual
value any differently from those below it. If the actual value is 100 units, errors of -10 and +10 units
both result in an APE of 10%. The modified APE [that is, sMAPE before aggregation] does treat them
differently. For example, the errors of -10 and +10 units, given above, would result in modified APEs
of 18.18% and 22.2%, respectively.” More importantly, the level of asymmetry in sMAPE is argued to
depend notably on the magnitude of the percentage errors (Goodwin & Lawton 1999; Koehler 2001;
Hyndman & Koehler 2006). For instance, Goodwin & Lawton (1999) show that when the forecast
error is +100% the modified APE is three times higher than when the error is -100%. Having observed
notably large percentage errors associated with the γ-profile model in our study (as outlined in the
second column of Table 5), we decided to avoid sMAPE and use MAPE. Furthermore,
following the M3-Competition (Makridakis & Hibon 2000), we excluded observations that are
exactly zero from the MAPE calculations to avoid infinite values.
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N} \left( O_n - P_n \right)^2} \qquad (23) \]

\[ \mathrm{MAE} = \frac{1}{N}\sum_{n=1}^{N} \left| O_n - P_n \right| \qquad (24) \]

\[ \mathrm{MAPE} = \frac{100}{N}\sum_{n=1}^{N} \frac{\left| O_n - P_n \right|}{O_n} \qquad (25) \]

\[ \mathrm{sMAPE} = \frac{200}{N}\sum_{n=1}^{N} \frac{\left| O_n - P_n \right|}{O_n + P_n} \qquad (26) \]
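For readers who wish to reproduce the error computations, the four measures in Eqs. 23–26 can be sketched in a few lines of Python (the paper's own code is in the MATA language of Stata; this is an illustrative re-implementation, with the M3-Competition zero-observation exclusion applied inside `mape`):

```python
import math

def rmse(obs, pred):
    """Root Mean Squared Error (Eq. 23)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean Absolute Error (Eq. 24)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean Absolute Percentage Error (Eq. 25); observations equal to zero
    are excluded, following the M3-Competition practice noted in the text."""
    pairs = [(o, p) for o, p in zip(obs, pred) if o != 0]
    return 100.0 / len(pairs) * sum(abs(o - p) / o for o, p in pairs)

def smape(obs, pred):
    """Symmetric MAPE (Eq. 26)."""
    return 200.0 / len(obs) * sum(abs(o - p) / (o + p)
                                  for o, p in zip(obs, pred))

# Scale dependence: the gap between 5 and 10 (a 100% error) and the gap
# between 50 and 55 (a 10% error) contribute identically to MAE, while
# MAPE tells them apart.
print(mae([10, 55], [5, 50]))   # → 5.0
print(mape([10, 55], [5, 50]))

# Under Eq. 26, sMAPE penalizes under- and over-predictions of the same
# absolute magnitude differently, whereas the plain APE treats them equally.
print(smape([100], [90]), smape([100], [110]))
```

The last line illustrates the asymmetry discussed above: a forecast of 90 against an observation of 100 receives a larger sMAPE penalty than a forecast of 110, even though both errors have magnitude 10.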