
Multinomial Logit Models with Continuous and Discrete Individual Heterogeneity in R : The gmnl Package

January 2016
Journal of Statistical Software 79(2)


DOI:10.18637/jss.v079.i02

License
CC BY 4.0

Authors:

Mauricio Sarrias

Universidad de Talca

Ricardo A. Daziano

Cornell University



Multinomial Logit Models with Continuous and Discrete Individual Heterogeneity in R: The gmnl Package

Mauricio Sarrias

Universidad Católica del Norte

Ricardo A. Daziano

Cornell University

Abstract

This paper introduces the package gmnl in R for estimation of multinomial logit models with unobserved heterogeneity across individuals for cross-sectional and panel (longitudinal) data. Unobserved heterogeneity is modeled by allowing the parameters to vary randomly over individuals according to a continuous, discrete, or discrete-continuous mixture distribution, which must be chosen a priori by the researcher. In particular, the models supported by gmnl are the multinomial or conditional logit, the mixed multinomial logit, the scale heterogeneity multinomial logit, the generalized multinomial logit, the latent class logit, and the mixed-mixed multinomial logit. These models are estimated using either the Maximum Likelihood Estimator or the Maximum Simulated Likelihood Estimator. This article describes and illustrates with real databases all functionalities of gmnl, including the derivation of individual conditional estimates of both the random parameters and willingness-to-pay measures.

Keywords: latent class, mixed multinomial logit, random parameters, preference heterogeneity,

1. Introduction

Modeling individual choices has been a very important avenue of research in diverse fields such as marketing, transportation, political science, and environmental, health, and urban economics. In all these areas the most widely used method to model choice among mutually exclusive alternatives has been the Conditional or Multinomial Logit model (MNL) (McFadden 1974), which belongs to the family of Random Utility Maximization (RUM) models. The main advantage of the MNL model has been its simplicity in terms of both estimation and interpretation of the resulting choice probabilities and elasticities. On the one hand, the MNL has a closed-form choice probability and a likelihood function that is globally concave (for a complete overview of MNL and its properties, see Train 2009). MNL estimation is thus straightforward using the Maximum Likelihood Estimator (MLE). On the other hand, it has been recognized that MNL not only imposes constant competition across alternatives, as a consequence of the independence of irrelevant alternatives (IIA) property, but also lacks the flexibility to allow for individual-specific preferences.
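To make the closed-form probability and the IIA property concrete, the following minimal sketch computes MNL choice probabilities in base R. The helper name mnl_prob and the utility values are illustrative assumptions; they are not part of gmnl.

```r
# Closed-form MNL choice probabilities (softmax of systematic utilities).
# `mnl_prob` and the utility values are hypothetical, for illustration only.
mnl_prob <- function(v) {
  ev <- exp(v - max(v))  # subtract the maximum for numerical stability
  ev / sum(ev)
}

v <- c(air = 0.5, train = 0.2, bus = -0.4, car = 0.1)
p_all <- mnl_prob(v)

# Remove one alternative and recompute the probabilities.
p_nobus <- mnl_prob(v[names(v) != "bus"])

# IIA: the odds between any two remaining alternatives do not change.
ratio_all   <- p_all[["air"]] / p_all[["train"]]
ratio_nobus <- p_nobus[["air"]] / p_nobus[["train"]]
print(c(ratio_all = ratio_all, ratio_nobus = ratio_nobus))
```

The two ratios coincide because the MNL odds between two alternatives depend only on their own utilities, which is the substitution pattern that random-parameter models relax.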

With the advent of more powerful computers and the improvement of simulation-aided inference in the last decades, researchers are no longer constrained to use models with closed-form solutions that may lead to unrealistic behavioral specifications. In fact, much of recent work in choice modeling focuses on extending MNL to allow for random-parameter models that accommodate unobserved preference heterogeneity.

The most popular MNL extension is the Mixed Logit Model (MIXL). MIXL allows parameters to vary randomly over individuals by assuming some continuous heterogeneity distribution a priori while keeping the MNL assumption that the error term is independent and identically distributed (i.i.d.) extreme value type 1 (McFadden and Train 2000; Train 2009; Hensher and Greene 2003). MIXL is a very flexible model (MIXL properties are discussed in McFadden and Train 2000) that does not exhibit pure IIA substitution (cf. Dotson, Brazell, Howell, Lenk, Otter, MacEachern, and Allenby 2015). Furthermore, using the parametric heterogeneity distribution that describes how preferences vary in the population, it is possible to derive conditional estimates of the parameters at the individual level (Train 2009).

Below we present a brief example to introduce the idea of unobserved heterogeneity across individuals, and how gmnl works. This example uses microdata about individual choice among four transportation modes: air, train, bus and car. Both the data and their correct formatting (which follows the mlogit data frame), as well as the gmnl syntax, are explained further in the following sections.

R> library("gmnl")

R> data("TravelMode", package = "AER")

R> library("mlogit")

R> TM <- mlogit.data(TravelMode, choice = "choice", shape = "long",

+ alt.levels = c("air", "train", "bus", "car"))

R> mixl <- gmnl(choice ~ vcost + travel + wait | 1 ,

+ data = TM,

+ model = "mixl",

+ ranp = c(travel = "n"),

+ R = 50)

Estimating MIXL model

In the estimated model, choice is a discrete (multinomial response) dependent variable that indicates which of the four alternative modes was actually chosen by each individual. There are three explanatory variables whose values are alternative-specific (vcost: in-vehicle cost, travel: in-vehicle time, and wait: waiting time). The model also includes alternative-specific constants (| 1). Random parameters are considered just for the in-vehicle time component, making this example an application of a Mixed Logit Model (model = "mixl"). The 'marginal disutility' of travel time (footnote 1) is assumed to be normally distributed in the line of code that identifies the random parameters (ranp = c(travel = "n")). The assumption of normality is an example of representing variation in preferences according to a parametric continuous distribution. All other parameters (the alternative-specific constants, wait and vcost) are assumed fixed. Finally, R = 50 evaluation points are used to simulate the likelihood function.

Footnote 1: In a discrete choice model, the parameters of interest are interpreted as marginal utilities, as choice is modeled as selecting the alternative that maximizes utility, which is a function of the explanatory variables. Some attributes cause a reduction in utility: travelers desire to reduce travel time, and hence its associated parameter is expected to be negative.

R> summary(mixl)

Model estimated on: Wed Mar 02 10:35:12 2016

Call:

gmnl(formula = choice ~ vcost + travel + wait | 1, data = TM,

model = "mixl", ranp = c(travel = "n"), R = 50, method = "bfgs")

Frequencies of categories:

air train bus car

0.276 0.300 0.143 0.281

The estimation took: 0h:0m:9s

Coefficients:
                  Estimate Std. Error z-value Pr(>|z|)
train:(intercept)  0.36131    1.01014    0.36  0.72058
bus:(intercept)   -0.44685    1.03936   -0.43  0.66725
car:(intercept)   -4.90453    1.09319   -4.49  7.2e-06 ***
vcost             -0.02611    0.00954   -2.74  0.00621 **
wait              -0.11642    0.01484   -7.84  4.4e-15 ***
travel            -0.00808    0.00236   -3.42  0.00062 ***
sd.travel          0.00531    0.00204    2.60  0.00927 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -189

Number of observations: 210

Number of iterations: 156

Exit of MLE: successful convergence

Simulation based on 50 draws

The output of calling the gmnl function includes the point estimates, standard errors, z-values, and p-values of the parameters of each explanatory variable. Note that for the normally distributed parameter, both the population mean and standard deviation are estimated: travel = -0.008 is the point estimate of the mean and sd.travel = 0.005 is the point estimate of the standard deviation of the normal distribution that represents how preferences for in-vehicle travel time reductions are distributed in the population. Note that we can infer the profile of individuals' preferences for travel time by looking at the shape of the distribution of the parameter, which is estimated for this sample as $N(-0.008, 0.005^2)$.
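As an illustration of what the shape of this distribution implies, the sketch below uses the point estimates reported in the summary output to compute the share of the population that would have a positive travel time coefficient under the fitted normal distribution. The variable names are ours, and the calculation ignores sampling uncertainty in the estimates.

```r
# Point estimates copied from the summary output above.
mean_travel <- -0.00808
sd_travel   <-  0.00531

# Share of the population with a positive travel-time coefficient
# under the fitted normal distribution.
share_positive <- pnorm(0, mean = mean_travel, sd = sd_travel,
                        lower.tail = FALSE)
print(round(share_positive, 3))
```

Under these estimates roughly 6% of the population is implied to have a counterintuitive positive coefficient, which is a known side effect of assuming an unbounded distribution such as the normal.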

In addition to MIXL, there are other models that can represent unobserved preference heterogeneity. Latent Class (LC) discrete choice models offer an alternative to MIXL by replacing the continuous distribution assumption with a discrete distribution in which preference heterogeneity is captured by membership in distinct classes or segments (Boxall and Adamowicz 2002; Greene and Hensher 2003; Shen 2009). The standard LC specification with class-specific multinomial logit models of choice (LC-MNL) is useful if the assumption of preference homogeneity holds within segments. In effect, in an LC-MNL all individuals in a given class have the same parameters (fixed parameters within a class), but the parameters vary across classes (heterogeneity across classes).

Rossi, Allenby, and McCulloch (2005) in a Bayesian setting, and more recently Bujosa, Riera, and Hicks (2010) and Greene and Hensher (2013) in the frequentist context, have derived models with discrete-continuous mixing distributions of unobserved heterogeneity in the form of a (finite) mixture of normals. This model is also known as Mixed-Mixed Logit (MM-MNL) (Keane and Wasi 2013).

Other researchers have focused on MNL extensions that allow for a more flexible representation of heteroskedasticity. For example, Fiebig, Keane, Louviere, and Wasi (2010) proposed two new models, namely the Scale Heterogeneity (S-MNL) model and the Generalized Multinomial Logit (G-MNL) model. S-MNL extends the MNL by letting the scale of errors vary across individuals (via a parametric specification of heteroskedasticity), whereas the G-MNL nests the S-MNL, MIXL, and MNL models. For a discussion of confounding effects between scale and preference heterogeneity, see Hess and Rose (2012) and Hess and Stathopoulos (2013).

In addition to gmnl, there exist different packages in R (R Core Team 2015) that estimate models with multinomial responses. Some packages for estimating Multinomial Logit models with fixed parameters are mlogit (Croissant 2012), RSGHB (Dumont, Keller, and Carpenter 2014), mnlogit (Hasan, Zhiyu, and Mahani 2015), the multinom function from the nnet package (Venables and Ripley 2002), VGAM (Yee 2010), and bayesm (Rossi 2012). The Multinomial Probit (MNP) model is fitted in bayesm, MNP (Imai and Dyk 2005) and mlogit. Models with random parameters are supported by mlogit, mclogit (footnote 2), bayesm, ChoiceModelR (footnote 3) and RSGHB. In terms of models with latent classes, RSGHB, flexmix (Leisch 2004), and poLCA (Linzer and Lewis 2011) offer alternative estimation procedures. gmnl is the only package to date that handles G-MNL specifications, and the only package that implements the maximum simulated likelihood estimator for MM-MNL. MM-MNL models are also implemented using a Bayes estimator in bayesm. However, gmnl allows the researcher to specify covariates that explain assignment to classes, whereas bayesm assumes constant weights for the continuous components of the double mixture (footnote 4).

Among all the packages mentioned above, mlogit is one of the most complete and user-friendly R packages for the estimation of models with multinomial responses. For this reason, we have adopted most of the mlogit syntax in gmnl. Table 1 presents a more complete overview of the models supported by each package and the estimation procedure used to estimate the parameters.

The gmnl package (Sarrias and Daziano 2015) is intended to consolidate in a single R package the whole range of discrete choice models with random parameters for the use of researchers and practitioners. It shares similar functionalities with mlogit and mnlogit in terms of the formula interface. Furthermore, since gmnl is able to estimate G-MNL models, it also allows the user to estimate models in willingness-to-pay space with a minimal extra reformulation.

Footnote 2: Only random intercepts are allowed in the current version.

Footnote 3: Both bayesm and ChoiceModelR only allow normally distributed parameters.

Footnote 4: One of the advantages of the Bayes estimator and its implementation in bayesm is the possibility of leaving the number of discrete components of the double mixture free. This is achieved by assuming a Dirichlet Process prior for the heterogeneity distribution.

Model    Package                    Estimation procedure
MNL      gmnl                       Maximum likelihood
MNL      mclogit                    Maximum likelihood
MNL      mlogit                     Maximum likelihood
MNL      MNL                        Maximum likelihood
MNL      mnlogit                    Maximum likelihood
MNL      multinom function (nnet)   Maximum likelihood
MNL      VGAM                       Maximum likelihood
MNL      bayesm                     Bayesian inference
MNL      MCMCpack                   Bayesian inference
MNL      RSGHB                      Bayesian inference
MNP      mlogit                     Maximum simulated likelihood
MNP      bayesm                     Bayesian inference
MNP      MNP                        Bayesian inference
MIXL     gmnl                       Maximum simulated likelihood
MIXL     mlogit                     Maximum simulated likelihood
MIXL     mclogit                    Penalized quasi-likelihood
MIXL     bayesm                     Bayesian inference
MIXL     ChoiceModelR               Bayesian inference
MIXL     RSGHB                      Bayesian inference
G-MNL    gmnl                       Maximum simulated likelihood
S-MNL    gmnl                       Maximum simulated likelihood
LC-MNL   gmnl                       Maximum likelihood
LC-MNL   flexmix                    Expectation-Maximization
LC-MNL   poLCA                      Expectation-Maximization
LC-MNL   RSGHB                      Bayesian inference
MM-MNL   gmnl                       Maximum simulated likelihood
MM-MNL   bayesm                     Bayesian inference

Table 1: Packages available in R for models with multinomial response.

Our package also provides the ability to construct the conditional estimates for the individual parameters and willingness-to-pay. gmnl is available from the Comprehensive R Archive Network (CRAN) at http://cran.r-project.org/package=gmnl.

The paper is organized as follows: Section 2 presents a brief overview of the models supported by gmnl. Section 3 discusses the functionalities of the package. Section 4 explains some computational issues that may arise when estimating random parameter models using maximum simulated likelihood. Finally, Section 5 concludes the paper.

2. Models

2.1. Mixed and latent class logit models

MIXL generalizes the MNL model by allowing the preference or taste parameters to be different for each individual (McFadden and Train 2000; Train 2009). MIXL is basically a random parameter logit model with continuous heterogeneity distributions. The random utility of person $i$ for alternative $j$ on choice occasion $t$ is:

$$U_{ijt} = x_{ijt}^\top \beta_i + \epsilon_{ijt}, \quad i = 1, \ldots, N; \; j = 1, \ldots, J; \; t = 1, \ldots, T_i, \tag{1}$$

where $x_{ijt}$ is a $K \times 1$ vector of observed alternative attributes; $\epsilon_{ijt}$ is the idiosyncratic error term or taste shock, and is i.i.d. extreme value type 1; the parameter vector $\beta_i$ is unobserved for each $i$ and is assumed to vary in the population following the continuous density $f(\beta_i | \theta)$, where $\theta$ are the parameters of this distribution. This mixing distribution can in principle take any shape. For example, when assuming that the parameters are distributed multivariate normal, $\beta_i \sim MVN(\beta, \Sigma)$, the vector $\beta_i$ can be re-written as:

$$\beta_i = \beta + L \eta_i,$$

where $\eta_i \sim N(0, I)$, and $L$ is the lower-triangular Cholesky factor of $\Sigma$ such that $L L^\top = VAR(\beta_i) = \Sigma$. If the off-diagonal elements of $L$ are zero, then the parameters are independently normally distributed. Observed heterogeneity (deterministic taste variations) can also be accommodated in the random parameters by including individual-specific covariates (see for example Greene 2012). Specifically, the vector of random coefficients is:

$$\beta_i = \beta + \Pi z_i + L \eta_i, \tag{2}$$

where $z_i$ is a set of $M$ characteristics of individual $i$ that influence the mean of the preference parameters, and $\Pi$ is a $K \times M$ matrix of additional parameters.
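The construction in Equation 2 can be sketched in a few lines of base R. All numerical values below (the means, the covariance matrix, and the matrix of covariate effects) are illustrative assumptions rather than estimates, and the code is not taken from the internals of gmnl.

```r
set.seed(123)

K <- 2                                   # number of random coefficients (assumed)
beta  <- c(-0.5, 1.0)                    # population means (illustrative)
Sigma <- matrix(c(1.0, 0.3,
                  0.3, 0.5), nrow = K)   # covariance of beta_i (illustrative)

# chol() returns the upper-triangular factor, so transpose to obtain L.
L <- t(chol(Sigma))

# Observed heterogeneity: M = 1 individual characteristic shifts the means.
Pi <- matrix(c(0.2, -0.1), nrow = K)     # K x M matrix (illustrative)
N  <- 5000
z  <- matrix(rnorm(N), nrow = 1)         # M x N matrix of characteristics

eta    <- matrix(rnorm(K * N), nrow = K) # eta_i ~ N(0, I)
beta_i <- beta + Pi %*% z + L %*% eta    # K x N, one column per individual

# The unobserved part L %*% eta has covariance close to Sigma.
print(round(cov(t(L %*% eta)), 2))
```

Setting the off-diagonal element of Sigma to zero in this sketch yields a diagonal L, which corresponds to the case of independent normal parameters.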

Unlike the MIXL model, LC uses a discrete mixing distribution, where individual $i$ belongs to class $q$ with probability $w_{iq}$, i.e.:

$$\beta_i = \beta_q \text{ with probability } w_{iq} \text{ for } q = 1, \ldots, Q,$$

where $\sum_q w_{iq} = 1$ and $w_{iq} > 0$. The discrete mixing distribution (or class assignment probability) is unknown to the analyst. The most widely used formulation for $w_{iq}$ is the semi-parametric Multinomial Logit format (Greene and Hensher 2003; Shen 2009):

$$w_{iq} = \frac{\exp(h_i^\top \gamma_q)}{\sum_{q=1}^{Q} \exp(h_i^\top \gamma_q)}; \quad q = 1, \ldots, Q, \; \gamma_1 = 0,$$

where $h_i$ denotes a set of socio-economic characteristics that determine assignment to classes. The parameters of the first class are normalized to zero for identification of the model. Note that one could omit any socio-economic covariate as a determinant of the class assignment probability. Under this scenario, the class probabilities simply become:

$$w_{iq} = \frac{\exp(\gamma_q)}{\sum_{q=1}^{Q} \exp(\gamma_q)}; \quad q = 1, \ldots, Q, \; \gamma_1 = 0,$$

where $\gamma_q$ is a constant (Scarpa and Thiene 2005).
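The class assignment probabilities can be computed directly from these expressions. The following sketch uses a hypothetical helper, class_prob, with made-up covariates and parameters; it only illustrates the normalization of the first class to zero and is not the code used by gmnl.

```r
# Class-assignment probabilities with the first class normalized to zero.
# `class_prob`, `H`, and `gamma` are hypothetical names and values.
class_prob <- function(H, gamma) {
  # H: N x P matrix of covariates (include a column of ones for constants)
  # gamma: P x (Q - 1) matrix of parameters for classes 2, ..., Q
  lin <- cbind(0, H %*% gamma)          # gamma_1 = 0 for identification
  ex  <- exp(lin - apply(lin, 1, max))  # row-wise stabilization
  ex / rowSums(ex)
}

H <- cbind(1, income = c(20, 35, 60) / 10)   # constant plus one covariate
gamma <- matrix(c(-1.0, 0.3,                 # class 2
                   0.5, -0.2),               # class 3
                nrow = 2)
w <- class_prob(H, gamma)
print(round(w, 3))

# Constants only: w_q = exp(gamma_q) / sum_q exp(gamma_q).
w_const <- class_prob(matrix(1, nrow = 1), matrix(c(-1.0, 0.5), nrow = 1))
print(round(w_const, 3))
```

Each row of w sums to one, and with constants only every individual receives the same class probabilities.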

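Let $y_{ijt} = 1$ if individual $i$ chooses $j$ on occasion $t$, and 0 otherwise. Then, the unconditional probabilities of the sequence of choices by individual $i$ for MIXL and LC are respectively given by:

$$P_i(\theta) = \int \left\{ \prod_{t=1}^{T_i} \prod_{j=1}^{J} \left[ \frac{\exp(x_{ijt}^\top \beta_i)}{\sum_{j=1}^{J} \exp(x_{ijt}^\top \beta_i)} \right]^{y_{ijt}} \right\} f(\beta_i) \, d\beta_i,$$

$$P_i(\theta) = \sum_{q=1}^{Q} w_{iq} \left\{ \prod_{t=1}^{T_i} \prod_{j=1}^{J} \left[ \frac{\exp(x_{ijt}^\top \beta_q)}{\sum_{j=1}^{J} \exp(x_{ijt}^\top \beta_q)} \right]^{y_{ijt}} \right\}.$$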

Both MIXL and LC are widely used in practice to accommodate preference heterogeneity across respondents. As discussed above, in the MIXL approach parameters are assumed to vary across the population according to some prespecified statistical distribution that continuously represents preferences. In the LC model a discrete number of separate classes or segments, each with different fixed parameters, recover preference heterogeneity. In addition to differentiation in terms of continuous versus discrete consumer segments, there exist further differences between MIXL and LC. For example, compared with the MIXL approach, the LC model has the advantage of being "relatively simple, reasonably plausible and statistically testable" (Shen 2009). In addition, because LC is a semiparametric specification that depends only on the prespecified number of classes, it avoids misspecification problems in the distribution of individual heterogeneity. In fact, the main disadvantage of MIXL is that the researcher has to choose the distribution of the random parameters a priori. Nevertheless, LC is less flexible than MIXL precisely because the parameters in each class are fixed. Another important difference between these two models is the estimation procedure. MIXL requires the use of the maximum simulated likelihood estimator, which can be very costly in terms of computational time, but no simulation is required for LC (footnote 5). gmnl implements the Maximum Likelihood Estimator for LC and its simulated counterpart for MIXL, using analytical expressions for the appropriate gradient.
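The difference in computation between the two models can be seen in a small base R sketch for a single individual. The attribute levels, choices, and parameter values are made up, and plain pseudo-random draws are used for simplicity; this is an illustration of the idea and not a description of how gmnl evaluates the likelihood.

```r
set.seed(42)

# One individual, T = 2 choice occasions, J = 3 alternatives, K = 1 attribute.
# All numbers are made up for illustration.
x <- list(c(1.0, 2.0, 3.0),     # attribute levels on occasion 1
          c(2.5, 1.5, 0.5))     # attribute levels on occasion 2
chosen <- c(1, 3)               # index of the chosen alternative per occasion

# Probability of the observed sequence of choices for a given coefficient b.
seq_prob <- function(b) {
  prod(mapply(function(xt, jt) {
    ev <- exp(xt * b)
    ev[jt] / sum(ev)
  }, x, chosen))
}

# MIXL: average the conditional probability over R draws of beta_i ~ N(mu, sd^2).
R <- 2000
draws  <- rnorm(R, mean = -0.8, sd = 0.4)
p_mixl <- mean(vapply(draws, seq_prob, numeric(1)))

# LC: a finite weighted sum over Q = 2 classes; no simulation is needed.
beta_q <- c(-1.2, -0.3)
w_q    <- c(0.4, 0.6)
p_lc   <- sum(w_q * vapply(beta_q, seq_prob, numeric(1)))

print(c(mixl = p_mixl, lc = p_lc))
```

The MIXL probability is an average over draws and therefore carries simulation noise that shrinks as R grows, whereas the LC probability is exact for given parameters.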

2.2. Mixed-Mixed Logit model

To take advantage of the benefits of both MIXL and LC, recent empirical papers have derived a mixture of both models. This double mixture is known as the 'Mixed-Mixed' Logit model (MM-MNL) (Keane and Wasi 2013) (footnote 6). Bujosa et al. (2010) and Greene and Hensher (2013) developed this MM-MNL model by extending the LC model to allow for random parameters within each class.

Consider the case where the heterogeneity distribution is generalized to a discrete mixture of multivariate normal distributions. In this case we have:

$$\beta_i \sim N(\beta_q, \Sigma_q) \text{ with probability } w_{iq} \text{ for } q = 1, \ldots, Q. \tag{3}$$

Footnote 5: For an empirical comparison between these two models, see for example Greene and Hensher (2003), Shen (2009) and Hess, Ben-Akiva, Gopinath, and Walker (2011).

Footnote 6: Train (2008) refers to this model as 'discrete mixture of continuous distributions', whereas Greene and Hensher (2013) label it 'LC-MIXL'.

The appeal of using a Gaussian mixture for the heterogeneity distribution is that any continuous distribution can be approximated by a discrete mixture of normal distributions (Train 2008). Note that the MM-MNL with only one class is equivalent to the MIXL model. Furthermore, if $\Sigma_q \to 0$ for all $q$, the model in Equation 3 becomes an LC-MNL model (Bujosa et al. 2010; Keane and Wasi 2013). Thus, MM-MNL nests both MIXL and LC.

The choice probabilities for the MM-MNL are given by:

$$P_i(\theta) = \sum_{q=1}^{Q} w_{iq} \int \left\{ \prod_{t=1}^{T_i} \prod_{j=1}^{J} \left[ \frac{\exp(x_{ijt}^\top \beta_i)}{\sum_{j=1}^{J} \exp(x_{ijt}^\top \beta_i)} \right]^{y_{ijt}} \right\} f_q(\beta_i) \, d\beta_i,$$

where $f_q(\beta_i) = N(\beta_q, \Sigma_q)$. Because this probability has no closed form, gmnl estimates the MM-MNL parameters by maximum simulated likelihood, using a Monte Carlo approximation of the choice probability and the analytical expression of the gradient.

2.3. Generalized multinomial logit model

Fiebig et al. (2010) proposed a general version of the MIXL model, which they called the G-MNL model, where the parameters vary across individuals according to:

$$\beta_i = \sigma_i \beta + [\gamma + \sigma_i (1 - \gamma)] L \eta_i, \tag{4}$$

where $\sigma_i$ is the individual-specific scale of the idiosyncratic error term, and $\gamma$ is a scalar parameter that controls how the variance of residual taste heterogeneity $L \eta_i$ varies with scale. To better understand this specification, it is useful to note that differing sub-models arise when some structural parameters in the G-MNL model are constrained:

• G-MNL-I: If $\gamma = 1$, then $\beta_i = \sigma_i \beta + L \eta_i$. In this model, the residual taste heterogeneity is independent of the scaling of $\beta$.

• G-MNL-II: If $\gamma = 0$, then $\beta_i = \sigma_i (\beta + L \eta_i)$. In this model, the residual taste heterogeneity is proportional to $\sigma_i$.

• S-MNL: If $VAR(\eta_i) = 0$, then $\beta_i = \sigma_i \beta$. As pointed out by Fiebig et al. (2010), this model is observationally equivalent to the particular type of heterogeneity in which the parameters increase or decrease proportionally across individuals by the scaling factor $\sigma_i$. S-MNL provides a more parsimonious representation of continuous heterogeneity than MIXL, because $\beta \sigma_i$ has fewer parameters than $\beta + L \eta_i$ (Fiebig et al. 2010).

• MIXL: $\beta_i = \beta + L \eta_i$, if $\sigma_i = 1$.

• MNL: $\beta_i = \beta$, if $\sigma_i = 1$ and $VAR(\eta_i) = 0$.
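The nesting structure listed above can be checked numerically. The sketch below codes Equation 4 in a hypothetical helper, gmnl_beta, and evaluates it under each restriction; the parameter values are arbitrary, and the case VAR(eta_i) = 0 is mimicked by setting L to zero.

```r
# Individual coefficients under G-MNL:
# beta_i = sigma_i * beta + [gamma + sigma_i * (1 - gamma)] * L %*% eta_i.
# `gmnl_beta` is a hypothetical helper, not a function exported by gmnl.
gmnl_beta <- function(beta, L, eta, sigma, gamma) {
  as.numeric(sigma * beta + (gamma + sigma * (1 - gamma)) * (L %*% eta))
}

beta  <- c(-1.0, 0.5)
L     <- matrix(c(0.6, 0.2, 0, 0.3), nrow = 2)  # lower-triangular factor
eta   <- c(0.4, -1.1)
sigma <- 1.5

b_gmnl1 <- gmnl_beta(beta, L, eta, sigma, gamma = 1)            # G-MNL-I
b_gmnl2 <- gmnl_beta(beta, L, eta, sigma, gamma = 0)            # G-MNL-II
b_mixl  <- gmnl_beta(beta, L, eta, sigma = 1, gamma = 0.5)      # MIXL
b_smnl  <- gmnl_beta(beta, 0 * L, eta, sigma, gamma = 0.5)      # S-MNL
b_mnl   <- gmnl_beta(beta, 0 * L, eta, sigma = 1, gamma = 0.5)  # MNL

print(rbind(b_gmnl1, b_gmnl2, b_mixl, b_smnl, b_mnl))
```

Note that when the scale equals one the value of gamma is irrelevant, since the term in square brackets collapses to one.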

Fiebig et al. (2010) note that some restrictions need to be considered to estimate the G-MNL model. First, the domain of $\sigma_i$ should be the positive real line. A positive scale parameter is ensured by assuming that $\sigma_i$ is distributed log-normal with standard deviation $\tau$ and mean $\bar{\sigma}$ (Fiebig et al. 2010):

$$\sigma_i = \exp(\bar{\sigma} + \tau \upsilon_i),$$

where $\upsilon_i \sim N(0, 1)$. Fiebig et al. (2010) also note that when $\tau$ is too large, numerical problems arise for extreme draws of $\upsilon_i$. To avoid this numerical issue, the authors suggest using a truncated normal distribution for $\upsilon_i$ with truncation at $\pm 2$, so that $\upsilon_i \sim TN[-2, +2]$. Greene and Hensher (2010) found that constraining $\upsilon_i$ at $-1.96$ and $+1.96$ maintains the smoothness of the estimator. Specifically, the authors used $\upsilon_{ir} = \Phi^{-1}(0.025 + 0.95 u_{ir})$, where $u_{ir}$ is a draw from the standard uniform distribution. gmnl allows the user to choose between these two ways of drawing $\upsilon_i$, using the argument typeR (see Section 3.3).

Note that the parameters $\bar{\sigma}$, $\tau$, and $\beta$ are not separately identified. Fiebig et al. (2010) suggest that one can normalize the mean $\bar{\sigma}$ by setting:

$$\bar{\sigma} = -\log\left[\frac{1}{N} \sum_{i=1}^{N} \exp(\tau \upsilon_i)\right].$$
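A minimal base R sketch of the inverse-CDF draws and of the normalization of the mean follows. The value of tau and the number of draws are arbitrary assumptions, and one draw per individual is used for simplicity; the code illustrates the formulas above rather than the implementation in gmnl.

```r
set.seed(1)

tau <- 0.8      # illustrative value of the scale heterogeneity parameter
N   <- 1000     # number of individuals (one draw each, for simplicity)
u   <- runif(N)

# Inverse-CDF draws confined to roughly (-1.96, 1.96),
# following the expression upsilon = qnorm(0.025 + 0.95 * u).
upsilon <- qnorm(0.025 + 0.95 * u)

# Normalization of the mean: sigma_bar = -log(mean(exp(tau * upsilon))).
sigma_bar <- -log(mean(exp(tau * upsilon)))
sigma_i   <- exp(sigma_bar + tau * upsilon)

print(range(upsilon))
print(mean(sigma_i))
```

By construction the sample mean of sigma_i equals one under this normalization, so the scale heterogeneity is centered on the MNL scale.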

Another important issue in G-MNL is the domain of $\gamma$. Initially, Fiebig et al. (2010) imposed $\gamma \in [0, 1]$. To constrain $\gamma$ to this interval, the authors used the logistic transformation:

$$\gamma = \frac{\exp(\gamma^*)}{1 + \exp(\gamma^*)},$$

and estimated $\gamma^*$. However, Keane and Wasi (2013) pointed out that both $\gamma < 0$ and $\gamma > 1$ still have meaningful behavioral interpretations. Thus, these authors estimate $\gamma$ directly. gmnl allows $\gamma$ to be estimated using either procedure.
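The logistic transformation is available in base R as plogis, with qlogis as its inverse. The short sketch below, with arbitrary values for the unrestricted parameter, shows that the transformed values always fall strictly between zero and one.

```r
# Logistic transformation: gamma = exp(gamma_star) / (1 + exp(gamma_star)).
gamma_star <- c(-10, -1, 0, 1, 10)   # unrestricted values (illustrative)
gamma      <- plogis(gamma_star)     # always strictly between 0 and 1

# The inverse transformation recovers the unrestricted parameter.
gamma_star_back <- qlogis(gamma)
print(round(gamma, 4))
```

Estimating the unrestricted parameter keeps the optimizer away from the boundaries, at the cost of ruling out the values outside the unit interval discussed by Keane and Wasi (2013).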

Finally, one can allow the mean of the scale to differ across individuals by including individual-specific characteristics. In this case the scale parameter can be written as:

$$\sigma_i = \exp(\bar{\sigma} + \delta s_i + \tau \upsilon_i),$$

where $s_i$ is a vector of attributes of individual $i$.

In terms of computation, all models, except for the LC and the MNL model, are estimated in gmnl using the maximum simulated likelihood estimator (MSLE) and the maxLik function from the maxLik package (Henningsen and Toomet 2011). All models are estimated using the analytical gradient (instead of the numerical gradient). The MNL is also estimated using the analytical Hessian.

For a complete derivation of the asymptotic properties of the MSLE and a more comprehensive review of how to implement this estimator, see Train (2009), Lee (1992), Gourieroux and Monfort (1997) or Hajivassiliou and Ruud (1986).

3. The gmnl package

3.1. Format of data

The function mlogit.data from mlogit is very useful to handle multinomial data formats. gmnl thus uses the same class of data for estimation. If the user forgets to set the data in the mlogit.data format, gmnl will give an error message and the estimation process will stop. For illustration purposes, we use the TravelMode data from the AER package (Kleiber and Zeileis 2008). As quickly presented in the introduction, the TravelMode data contains actual individual choices (footnote 7) among four transportation modes (air, train, bus and car) for travel between the cities of Sydney and Melbourne in Australia.

R> data("TravelMode", package = "AER")

R> with(TravelMode, prop.table(table(mode[choice == "yes"])))

air train bus car

0.276 0.300 0.143 0.281

Each mode is characterized by four alternative-specific variables (wait, vcost, travel, gcost) (footnote 8), and two individual-specific variables (income, size). The observed shares of each mode are 27.62% (air), 30% (train), 14.29% (bus), and 28.1% (car). More details about and examples using this dataset can be found in Kleiber and Zeileis (2008).

R> head(TravelMode)

  individual  mode choice wait vcost travel gcost income size
1          1   air     no   69    59    100    70     35    1
2          1 train     no   34    31    372    71     35    1
3          1   bus     no   35    25    417    70     35    1
4          1   car    yes    0    10    180    30     35    1
5          2   air     no   64    58     68    68     30    2
6          2 train     no   44    31    354    84     30    2

As can be seen above, the data is in a "long" format (one row per available mode) and can be transformed into the structure needed by gmnl using mlogit.data in the following way:

R> library("mlogit")

R> TM <- mlogit.data(TravelMode, choice = "choice", shape = "long",

+ alt.levels = c("air", "train", "bus", "car"))

The argument choice indicates the choice made by the individuals; shape specifies the original format of the data; and alt.levels is a character vector that contains the names of the alternatives. We show how to transform other kinds of data in the examples below. For a more complete treatment of the data using the mlogit.data function see Croissant (2012).

Before formal modeling, it is useful to summarize the (unconditional) relationship between the travel mode and the regressors. In the next example, we reshape the data from long to wide format:

Footnote 7: In the choice modeling literature, microdata that represents real choices is known as 'revealed preference' (RP) data.

Footnote 8: vcost denotes travel cost, whereas gcost represents a generalized cost that combines both vcost and travel (which is in-vehicle travel time) using an exogenous value of time. wait is for waiting time.

R> wide_TM <- reshape(TravelMode,

+ idvar = c("individual", "income", "size"),

+ timevar = "mode" ,

+ direction = "wide")

R> wide_TM$chosen_mode[wide_TM$choice.air == "yes"] <- "air"

R> wide_TM$chosen_mode[wide_TM$choice.car == "yes"] <- "car"

R> wide_TM$chosen_mode[wide_TM$choice.train == "yes"] <- "train"

R> wide_TM$chosen_mode[wide_TM$choice.bus == "yes"] <- "bus"

For the case-specific income variable, we get the following summary:

R> library("plyr")

R> ddply(wide_TM, ~ chosen_mode,

+ summarize,

+ mean.income = mean(income))

chosen_mode mean.income

1 air 41.7

2 bus 29.7

3 car 42.2

4 train 23.1
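The same kind of group summary can be obtained without plyr. The sketch below uses aggregate from base R on a small toy data frame with made-up values that stands in for wide_TM, so its numbers do not correspond to the TravelMode data.

```r
# Toy stand-in for wide_TM with hypothetical values (not the TravelMode data).
toy <- data.frame(
  chosen_mode = c("air", "car", "train", "air", "bus", "car"),
  income      = c(45, 40, 20, 38, 30, 44)
)

# Group means with base R, analogous to the ddply() call above.
by_mode <- aggregate(income ~ chosen_mode, data = toy, FUN = mean)
names(by_mode)[2] <- "mean.income"
print(by_mode)
```

Replacing toy with wide_TM should give the same table as the ddply call, assuming the chosen_mode column has been created as shown earlier.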

On average, those individuals choosing train have the lowest income and those choosing car have the highest. The relationship between the chosen travel mode and the alternative-specific regressor vcost is summarized as follows (a similar analysis can be done for the rest of the alternative-specific variables):

R> ddply(wide_TM, ~ chosen_mode, summarize,

+ mean.air = mean(vcost.air),

+ mean.car = mean(vcost.car),

+ mean.train = mean(vcost.train),

+ mean.bus = mean(vcost.bus))

chosen_mode mean.air mean.car mean.train mean.bus

1 air 97.6 23.4 58.5 34.3

2 bus 89.3 26.8 62.3 33.7

3 car 76.3 15.6 53.6 33.8

4 train 80.4 21.1 37.5 32.3

Note that these figures show that the chosen mode is not determined solely by travel cost. The purpose of the choice model is precisely to determine the tradeoffs across attributes that help to explain choices.

3.2. Formula interface

The specification of Multinomial Logit models using gmnl is similar to that of mlogit and mnlogit. In particular, we use the R package Formula (Zeileis and Croissant 2010), which is able to handle multi-part formulae.

Consider the TravelMode data and suppose that we want to estimate a Multinomial Logit model where the variables wait and vcost are alternative-specific variables with a generic coefficient $\beta$; income is an individual-specific variable with an alternative-specific coefficient $\gamma_j$; and the variable travel is an alternative-specific variable with an alternative-specific coefficient $\delta_j$. This is done using the following 3-part formula:

R> f1 <- choice ~ wait + vcost | income | travel

By default, the alternative-specific constants (ASC) for each alternative are included. They can be omitted by adding +0 or -1 in the second part of the formula. For example:

R> f2 <- choice ~ wait + vcost | income + 0 | travel

R> f2 <- choice ~ wait + vcost | income - 1 | travel

Some parts may be omitted when there is no ambiguity. For instance, a model with only individual-specific variables can be specified as follows:

R> f3 <- choice ~ 0 | income + size | 0

R> f3 <- choice ~ 0 | income + size | 1

Similarly, a Conditional Logit model, that is, a model with alternative-specific variables with a generic coefficient $\beta$, can be specified using any of the following formula objects:

R> f4 <- choice ~ wait + vcost | 0

R> f4 <- choice ~ wait + vcost | 0 | 0

R> f4 <- choice ~ wait + vcost | -1 | 0

Other models, such as the MIXL, S-MNL, LC-MNL and MM-MNL models, require the fourth and fifth parts of the formula. As explained in Section 2.1, gmnl allows incorporating observed heterogeneity in the mean of the random parameters. This can be achieved by including individual-specific characteristics (income and size) in the fourth part of the formula:

R> f5 <- choice ~ wait + vcost | 0 | 0 | income + size - 1

and then using the mvar argument to indicate how these two variables modify the mean of the random parameters. For a more complete example see Section 3.4.

The fifth part of the formula is reserved for either models with heterogeneity in the scale parameter or models with latent classes. For example, an S-MNL or G-MNL model where the scale varies across individuals by individual-specific characteristics can be specified as follows:

R> f6 <- choice ~ wait + vcost | 1 | 0 | 0 | income + size - 1

The same formulation can be used if a model with latent classes is estimated and both income and size determine the class assignment.

3.3. Estimating S-MNL models

In this example, we estimate an S-MNL model using the TravelMode data where the ASCs

are fixed and not scaled. Fiebig et al. (2010) found that in a model where all attributes are

scaled (including the ASCs), the estimates often show explosive behavior and the model

actually produces a worse fit. The basic syntax for estimation is the following:

R> library("gmnl")

R> smnl.nh <- gmnl(choice ~ wait + vcost + travel | 1,

+ data = TM,

+ model = "smnl",

+ R = 30,

+ notscale = c(1, 1, 1, rep(0, 3)))

The following variables are not scaled:

[1] "train:(intercept)" "bus:(intercept)" "car:(intercept)"

Estimating SMNL model

The component | 1 in the formula means that the model is ﬁtted using ASCs for the J−1

alternatives. The main argument in the model is model = "smnl", which indicates to the

function that the user wants to estimate the S-MNL model (without random parameters).

The rest of the models allowed by gmnl are given in Table 2. R = 30 indicates that 30 Halton

draws are used to simulate the probabilities. Another important argument in this example is

notscale. This is a vector that indicates which variables will not be scaled (1 = not scaled

and 0 = scaled). Since the ASCs are always the ﬁrst variables entering in the model (if they

are speciﬁed using | 1 in the second part of formula) and only J−1 = 3 ASCs are created,

notscale = c(1, 1, 1, rep(0, 3)) implies that the constants will not be scaled.
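As a small illustration, the notscale vector can be built from the coefficient names rather than typed by hand. The sketch below assumes the coefficient order shown in the estimation output (ASCs first, then the attributes); the names are copied from that output and the construction itself is only a convenience, not something gmnl requires.

```r
# Coefficient order as reported for the S-MNL model: ASCs first, then attributes
coef_names <- c("train:(intercept)", "bus:(intercept)", "car:(intercept)",
                "wait", "vcost", "travel")

# 1 = not scaled, 0 = scaled: flag every alternative-specific constant
notscale <- as.numeric(grepl("(intercept)", coef_names, fixed = TRUE))
names(notscale) <- coef_names
print(notscale)
```

Building the vector by name reduces the risk of misaligning the flags when attributes are added to or removed from the formula.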

Options for "model"    Model

"mnl"                  Multinomial Logit Model
"mixl"                 Mixed Logit Model
"smnl"                 Scaled Multinomial Logit Model
"gmnl"                 Generalized Multinomial Logit Model
"lc"                   Latent Class Multinomial Logit Model
"mm"                   Mixed-Mixed Multinomial Logit Model

Table 2: Models supported by gmnl.

R> summary(smnl.nh)

Model estimated on: Wed Mar 02 10:35:15 2016

Call:

gmnl(formula = choice ~ wait + vcost + travel | 1, data = TM,

model = "smnl", R = 30, notscale = c(1, 1, 1, rep(0, 3)),

method = "bfgs")


Frequencies of categories:

air train bus car

0.276 0.300 0.143 0.281

The estimation took: 0h:0m:3s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

train:(intercept) -1.25946 0.52091 -2.42 0.01561 *

bus:(intercept) -2.02267 0.64669 -3.13 0.00176 **

car:(intercept) -7.19922 1.42002 -5.07 4.0e-07 ***

wait -0.14085 0.02777 -5.07 3.9e-07 ***

vcost -0.02198 0.01196 -1.84 0.06615 .

travel -0.00481 0.00127 -3.78 0.00016 ***

tau 0.54412 0.16422 3.31 0.00092 ***

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -188

Number of observations: 210

Number of iterations: 54

Exit of MLE: successful convergence

Simulation based on 30 draws

The results report the point estimates for each variable. Note that all preference parameters

are statistically significant and negative,⁹ meaning that cost and the time components create

a ‘disutility’ to the traveler, so that people prefer modes that are both cheaper and faster.

τ, which represents the standard deviation of σi, is also signiﬁcant supporting the presence

of heterogeneous preferences by means of diﬀerent error variances for each individual. The

output also gives additional information about estimation. The model is estimated using the

BFGS procedure. Other optimization procedures such as the BHHH and Newton Raphson

(NR) can be called using the argument method passed to the maxLik function.
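The z-values and p-values in the coefficient table follow from the estimates and standard errors in the usual way. The sketch below recomputes them in base R from the rounded figures printed above, so small discrepancies in the last digit are expected.

```r
# Estimates and standard errors copied (rounded) from the summary of smnl.nh
est <- c(wait = -0.14085, vcost = -0.02198, travel = -0.00481, tau = 0.54412)
se  <- c(wait =  0.02777, vcost =  0.01196, travel =  0.00127, tau = 0.16422)

z <- est / se                  # z-value: estimate divided by its standard error
p <- 2 * pnorm(-abs(z))        # two-sided p-value under asymptotic normality

print(round(cbind(Estimate = est, Std.Error = se, z.value = z, p.value = p), 5))
```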

Another important point is that the number of observations reported by gmnl corresponds

to N/J if cross-sectional data is used, or N×T /J if panel data (repeated choice situations)

is used. Finally, it is always important to check all the details in the estimation output.

In our example, the output informs us that the convergence was achieved successfully. If

convergence fails, the analyst needs to revise identiﬁcation of the model, starting values,

estimation procedure, and measurement scale of the attributes. For further details about

potential convergence problems see Section 4.

⁹ vcost is significant at the 10% level.

In the next example, we allow the scale to differ across individuals according to their income. Basically, we assume that:

$$\sigma_i = \exp(\bar{\sigma} + \delta_{\text{income}}\,\text{income}_i + \tau \upsilon_i).$$

The syntax is very similar to our previous example, with minor changes in the formula

argument:

R> smnl.het <- gmnl(choice ~ wait + vcost + travel | 1 |

+ 0 | 0 | income - 1,

+ data = TM,

+ model = "smnl",

+ R = 30,

+ notscale = c(1, 1, 1, 0, 0, 0),

+ typeR = FALSE)

The following variables are not scaled:

[1] "train:(intercept)" "bus:(intercept)" "car:(intercept)"

Estimating SMNL model

The ﬁfth part of the formula is reserved for individual-speciﬁc variables that aﬀect scale. In

this example, we specify that the variable income and no constant are included in σi.
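To see what this specification implies, the scale can be simulated directly from the equation. The sketch below is illustrative only: the income values are hypothetical, the values of tau and delta are the rounded point estimates reported for smnl.het, and the normalisation of the constant (chosen here so that the simulated scale has mean one) is an assumption rather than a description of gmnl's internal procedure.

```r
set.seed(123)

tau    <- 0.54198                   # rounded estimate of tau for smnl.het
delta  <- 0.00564                   # rounded estimate of het.income
income <- c(10, 25, 40, 55, 70)     # hypothetical income values
R      <- 1000                      # draws per individual

# One row per individual, one column per standard normal draw
ups <- matrix(rnorm(length(income) * R), nrow = length(income))

# Linear index without the constant; income is recycled down the rows
lin <- delta * income + tau * ups

# Assumed normalisation: pick sigma_bar so that the average scale equals one
sigma_bar <- -log(mean(exp(lin)))
sigma <- exp(sigma_bar + lin)

print(round(rowMeans(sigma), 3))    # average scale by income level
```

With a positive delta, individuals with higher income have a larger average scale, which corresponds to a smaller error variance.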

R> summary(smnl.het)

Model estimated on: Wed Mar 02 10:35:19 2016

Call:

gmnl(formula = choice ~ wait + vcost + travel | 1 | 0 | 0 | income -

1, data = TM, model = "smnl", R = 30, notscale = c(1, 1,

1, 0, 0, 0), typeR = FALSE, method = "bfgs")

Frequencies of categories:

air train bus car

0.276 0.300 0.143 0.281

The estimation took: 0h:0m:4s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

train:(intercept) -0.99226 0.49290 -2.01 0.04410 *

bus:(intercept) -1.69842 0.61673 -2.75 0.00589 **

car:(intercept) -6.92476 1.22665 -5.65 1.6e-08 ***

wait -0.11795 0.02253 -5.23 1.7e-07 ***

vcost -0.01489 0.01019 -1.46 0.14412

travel -0.00439 0.00111 -3.95 7.8e-05 ***

tau 0.54198 0.14234 3.81 0.00014 ***

het.income 0.00564 0.00339 1.66 0.09635 .


---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -185

Number of observations: 210

Number of iterations: 66

Exit of MLE: successful convergence

Simulation based on 30 draws

The results are similar to those of the previous example. However, vcost is no longer statis-

tically signiﬁcant. All the parameters for the variables that enter in the scale are preceded by

the string het. Thus, the coeﬃcient het.income corresponds to δincome, and is signiﬁcant at

the 10% level. This result indicates that the variability of the error term for each individual

depends on their income. Finally, the argument typeR determines the type of draws used for

the scale parameter. If TRUE, truncated normal draws are used for the scale parameter. In this

case, the function rtruncnorm of truncnorm (Trautmann, Steuer, Mersmann, and Bornkamp

2014) is used. If typeR = FALSE, as in this example, the procedure suggested by Greene and

Hensher (2010) is used. See Section 2.3 for more details.

Suppose now that we want to test the null hypothesis $H_0: \delta_{\text{income}} = 0$. This test can

be performed using the function waldtest or lrtest from the package lmtest (Zeileis and

Hothorn 2002):

R> library("lmtest")

R> waldtest(smnl.nh, smnl.het)

Wald test

Model 1: choice ~ wait + vcost + travel | 1

Model 2: choice ~ wait + vcost + travel | 1 | 0 | 0 | income - 1

Res.Df Df Chisq Pr(>Chisq)

1 203

2 202 1 2.76 0.096 .

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

R> lrtest(smnl.nh, smnl.het)

Likelihood ratio test

Model 1: choice ~ wait + vcost + travel | 1

Model 2: choice ~ wait + vcost + travel | 1 | 0 | 0 | income - 1

#Df LogLik Df Chisq Pr(>Chisq)

1 7 -188

2 8 -185 1 5.21 0.022 *

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1


Both tests are asymptotically equivalent, but they can provide diﬀerent results for ﬁnite

samples. In this case, both reject the null hypothesis, but at diﬀerent signiﬁcance levels.
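For a single restriction, both statistics are easy to reproduce by hand. The sketch below uses the rounded estimate and standard error of het.income for the Wald statistic and takes the LR statistic as printed by lrtest, because the log-likelihoods shown in the summaries (-188 and -185) are rounded to whole numbers.

```r
# Wald statistic for one restriction: the squared z-value of het.income
est <- 0.00564
se  <- 0.00339
wald   <- (est / se)^2
p_wald <- pchisq(wald, df = 1, lower.tail = FALSE)

# LR statistic as printed by lrtest(); 2 * (logLik_het - logLik_nh) recomputed
# from the rounded log-likelihoods would give 6 instead of 5.21
lr   <- 5.21
p_lr <- pchisq(lr, df = 1, lower.tail = FALSE)

print(round(c(wald = wald, p_wald = p_wald, lr = lr, p_lr = p_lr), 3))
```

The Wald test rejects the null hypothesis only at the 10% level, whereas the LR test rejects it at the 5% level.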

In some cases, one will need to decide among non-nested models. In such cases the LR, Wald

and Lagrange multiplier tests cannot be applied. However, one can use either the AIC or BIC

criteria to measure the relative quality of models. In general, given a set of candidate models

for the data, the preferred model is the one with the minimum AIC or BIC value. We can

obtain the AIC and BIC criteria by typing:

R> AIC(smnl.nh)

[1] 390

R> AIC(smnl.het)

[1] 387

R> BIC(smnl.nh)

[1] 414

R> BIC(smnl.het)

[1] 414

The results show that AIC favors the smnl.het model, whereas the BIC is not able to dis-

criminate between both models.
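Both criteria are simple functions of the log-likelihood, the number of parameters and the sample size. The sketch below applies the standard definitions to the rounded log-likelihoods printed in the summaries, assuming that the sample size entering the BIC is the reported number of observations (210); because of the rounding, the results can differ by about one unit from the values returned by AIC() and BIC().

```r
# Standard definitions: AIC = -2 logL + 2k, BIC = -2 logL + k log(n)
ic <- function(loglik, k, n) {
  c(AIC = -2 * loglik + 2 * k,
    BIC = -2 * loglik + k * log(n))
}

ic_nh  <- ic(loglik = -188, k = 7, n = 210)   # smnl.nh: 7 parameters
ic_het <- ic(loglik = -185, k = 8, n = 210)   # smnl.het: 8 parameters

print(round(rbind(smnl.nh = ic_nh, smnl.het = ic_het), 1))
```

The BIC penalty per parameter, log(210), is more than twice the AIC penalty of 2, which is why the extra parameter in smnl.het is rewarded by the AIC but not by the BIC.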

3.4. Estimating MIXL models

In the following examples we show how to estimate MIXL models using gmnl. The package

mlogit is very eﬃcient in estimating MIXL models. However, one advantage of using gmnl is

the inclusion of individual-speciﬁc variables to explain the mean of the random parameters

(see Equation 2). Other important expansions include the possibility of producing point and

interval estimates at the individual level, and the consideration of Johnson $S_b$ heterogeneity

distributions.

If we assume that the coefficients of travel and wait vary across individuals according to:

$$\beta_{\text{travel},i} = \beta_1 + \pi_{11}\,\text{income} + \pi_{12}\,\text{size} + \sigma_1 \eta_{1i}$$

$$\beta_{\text{wait},i} = \beta_2 + \pi_{21}\,\text{income} + \sigma_2 \eta_{2i},$$

where $\eta_{1i}$ is triangular and $\eta_{2i} \sim N(0, 1)$, the corresponding MIXL model is estimated by typing:

R> mixl.hier <- gmnl(choice ~ vcost + travel + wait | 1 |

+ 0 | income + size - 1,

+ data = TM,

+ model = "mixl",

+ ranp = c(travel = "t", wait = "n"),

+ mvar = list(travel = c("income","size"),

+ wait = c("income")),

+ R = 50,

+ haltons = list("primes" = c(2, 17),

+ "drop" = rep(19, 2)))

Estimating MIXL model

The argument model = "mixl" indicates that the MIXL model will be estimated. The distributions of the random coefficients are specified by the argument ranp. The distributions supported by gmnl are presented in Table 3.¹⁰ Note also that the fourth part of the formula is reserved for all the variables that enter the mean of the random parameters. The argument mvar indicates which variables enter each specific random parameter. For example, travel = c("income","size") indicates that the mean of the travel coefficient varies according to income and size. Finally, haltons is relevant if ranp is not NULL. If haltons = NULL, pseudo-random draws are used instead of Halton sequences. If haltons = NA, the first K primes are used to generate the Halton draws, where K is the number of random parameters, and the first 15 elements of each sequence are dropped. Otherwise, haltons should be a list with elements primes and drop. In this example we use the prime numbers 2 and 17, and we drop the first 19 elements of each series. For a further explanation of Halton draws see Train (2009).

Shorthands    Distributions

"n"           Normal distribution
"ln"          Log-normal distribution
"cn"          Truncated (at zero) normal distribution
"t"           Triangular distribution
"u"           Uniform distribution
"sb"          Johnson $S_b$ distribution

Table 3: Continuous distributions supported by gmnl.
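The role of the primes and of the dropped elements can be illustrated with a small base R implementation of the Halton sequence. This is a sketch of the textbook construction, not gmnl's internal generator, which may differ in details; the function name `halton` and the triangular transform on [-1, 1] are assumptions made for the example.

```r
# Textbook Halton sequence: radical inverse of the integers in base `prime`,
# skipping the first `drop` elements
halton <- function(n, prime, drop = 0) {
  idx <- seq_len(n) + drop
  vapply(idx, function(i) {
    h <- 0
    f <- 1
    while (i > 0) {
      f <- f / prime
      h <- h + f * (i %% prime)    # next base-`prime` digit, mirrored
      i <- i %/% prime
    }
    h
  }, numeric(1))
}

# 50 draws per random parameter, primes 2 and 17, first 19 elements dropped
u_travel <- halton(50, prime = 2,  drop = 19)
u_wait   <- halton(50, prime = 17, drop = 19)

# Inverse-CDF transforms of the uniform draws
eta_wait   <- qnorm(u_wait)                         # "n": standard normal
eta_travel <- ifelse(u_travel < 0.5,                # "t": triangular on [-1, 1]
                     sqrt(2 * u_travel) - 1,
                     1 - sqrt(2 * (1 - u_travel)))

print(halton(5, prime = 2))
```

Using a different prime for each random parameter keeps the two sequences from moving together, and dropping the initial elements removes the strongly patterned start of each sequence.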

R> summary(mixl.hier)

Model estimated on: Wed Mar 02 10:36:37 2016

¹⁰ It is worth mentioning that given how the random parameters of the G-MNL model are constructed (see Equation 4), the distributions allowed when model = "gmnl" are the normal, uniform, and triangular. Similarly, when the model is estimated with correlated random parameters, only the normal distribution and its transformations (log-normal and truncated normal) are allowed.

Call:

gmnl(formula = choice ~ vcost + travel + wait | 1 | 0 | income +

size - 1, data = TM, model = "mixl", ranp = c(travel = "t",

wait = "n"), R = 50, haltons = list(primes = c(2, 17), drop = rep(19,

2)), mvar = list(travel = c("income", "size"), wait = c("income")),

method = "bfgs")

Frequencies of categories:

air train bus car

0.276 0.300 0.143 0.281

The estimation took: 0h:1m:18s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

train:(intercept) -1.70e-01 1.13e+00 -0.15 0.88005

bus:(intercept) -1.14e+00 1.23e+00 -0.93 0.35120

car:(intercept) -9.00e+00 2.52e+00 -3.57 0.00035 ***

vcost -3.07e-02 1.52e-02 -2.02 0.04288 *

travel -7.75e-03 3.29e-03 -2.35 0.01861 *

wait -1.53e-01 4.48e-02 -3.42 0.00062 ***

travel.income -1.59e-04 6.42e-05 -2.48 0.01324 *

travel.size 3.51e-03 1.37e-03 2.56 0.01057 *

wait.income -1.19e-03 6.48e-04 -1.84 0.06568 .

sd.travel 5.48e-03 4.02e-03 1.36 0.17288

sd.wait 8.41e-02 3.48e-02 2.41 0.01582 *

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -165

Number of observations: 210

Number of iterations: 357

Exit of MLE: successful convergence

Simulation based on 50 draws

The output shows the estimates in the following order: ﬁxed parameters, mean of the ran-

dom parameters, eﬀect of the variables that aﬀect the mean of the random parameters, and

ﬁnally the standard deviation/spread of the random parameters. Note that travel.income

corresponds to π11, travel.size corresponds to π12, and wait.income corresponds to π21.

The parameters have the expected sign. Note that individuals with higher income are more

sensitive to in-vehicle time (due to the signiﬁcant and negative travel.income). Whereas

both components of time were assumed random, only waiting time appears as having signiﬁ-

cant variation in how it is perceived across travelers. This result may be due to income and

size explaining taste variations (observed heterogeneity).
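The effect of the observed heterogeneity is easiest to read by evaluating the mean of each random coefficient for a given individual. The sketch below uses the rounded estimates from the output above; the income and size values are hypothetical and chosen only for illustration.

```r
# Rounded estimates copied from the summary of mixl.hier
b <- c(travel = -7.75e-03, wait = -1.53e-01,
       travel.income = -1.59e-04, travel.size = 3.51e-03,
       wait.income = -1.19e-03)

# Mean of the random coefficients for an individual with given characteristics
mean_coef <- function(income, size) {
  c(travel = unname(b["travel"] + b["travel.income"] * income +
                      b["travel.size"] * size),
    wait   = unname(b["wait"] + b["wait.income"] * income))
}

low  <- mean_coef(income = 35, size = 2)   # hypothetical individual
high <- mean_coef(income = 70, size = 2)   # same party size, higher income

print(rbind(low = low, high = high))
```

Holding size fixed, the higher-income individual has more negative mean coefficients for both travel and wait, in line with the interpretation given above.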

We now estimate a correlated random parameter model. For this example, we will use the Electricity data from the mlogit package, which is a panel dataset. There are 4,308 observations in this microdata set, but only 361 individuals. The analyst designed 12 hypothetical choice scenarios according to a discrete choice experiment,¹¹ where four hypothetical electricity suppliers were described in terms of price, which could be fixed (pf), time-of-day rate (tod), or seasonal rate (seas); length of contract (cl); and being local (loc) or 'well-known' (wk). The experimental design considered unlabeled alternatives, which means that alternative-specific constants can be set at zero. These microdata are in a "wide format" (one row describes all alternatives in a given choice situation). Given compilation time restrictions, in this example we will use just a subsample of this database (subset = 1:3000). The user may want to use the whole sample to reproduce this case study.

R> data("Electricity", package = "mlogit")

R> Electr <- mlogit.data(Electricity, id.var = "id", choice = "choice",

+ varying = 3:26, shape = "wide", sep = "")

In this example, two arguments are especially relevant in the gmnl function. First, panel =

TRUE indicates that the data is a panel. When using panel data, the user needs to specify a

variable in the id.var argument of the mlogit.data function that identiﬁes the individual.

Second, to estimate correlated random parameters correlation = TRUE needs to be indicated

in the gmnl function. The syntax is the following:

R> Elec.cor <- gmnl(choice ~ pf + cl + loc + wk + tod + seas | 0,

+ data = Electr,

+ subset = 1:3000,

+ model = 'mixl',

+ R = 50,

+ panel = TRUE,

+ ranp = c(cl = "n", loc = "n", wk = "n",

+ tod = "n", seas = "n"),

+ correlation = TRUE)

Estimating MIXL model

R> summary(Elec.cor)

Model estimated on: Wed Mar 02 10:36:53 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas | 0,

data = Electr, subset = 1:3000, model = "mixl", ranp = c(cl = "n",

loc = "n", wk = "n", tod = "n", seas = "n"), R = 50,

correlation = TRUE, panel = TRUE, method = "bfgs")

Frequencies of categories:

¹¹ Discrete choice experiments collect 'stated preference' (SP) data, where choices reflect intended behavior. In this example, some respondents did not provide their choices for all 12 choice situations.

1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:0m:16s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

pf -0.8702 0.0786 -11.07 < 2e-16 ***

cl -0.1765 0.0430 -4.11 4.0e-05 ***

loc 2.3822 0.3053 7.80 6.0e-15 ***

wk 1.9447 0.2493 7.80 6.2e-15 ***

tod -8.5026 0.7423 -11.45 < 2e-16 ***

seas -8.6456 0.7803 -11.08 < 2e-16 ***

sd.cl.cl 0.3919 0.0420 9.33 < 2e-16 ***

sd.cl.loc 0.4921 0.1983 2.48 0.01311 *

sd.cl.wk 0.5514 0.2131 2.59 0.00966 **

sd.cl.tod -0.9834 0.2802 -3.51 0.00045 ***

sd.cl.seas -0.1470 0.2297 -0.64 0.52206

sd.loc.loc 2.5925 0.4226 6.14 8.5e-10 ***

sd.loc.wk 1.9311 0.3610 5.35 8.8e-08 ***

sd.loc.tod 1.0198 0.5651 1.80 0.07114 .

sd.loc.seas 0.0941 0.4579 0.21 0.83723

sd.wk.wk -0.3330 0.2212 -1.51 0.13226

sd.wk.tod 1.9341 0.3208 6.03 1.7e-09 ***

sd.wk.seas 0.7349 0.3030 2.43 0.01529 *

sd.tod.tod 2.0635 0.3301 6.25 4.1e-10 ***

sd.tod.seas 1.1689 0.2539 4.60 4.2e-06 ***

sd.seas.seas 1.7034 0.2533 6.72 1.8e-11 ***

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -692

Number of observations: 750

Number of iterations: 97

Exit of MLE: successful convergence

Simulation based on 50 draws

Note that the estimates for the population means indicate that, on average, individuals prefer contracts that are cheaper and shorter, and companies that are local and well-known. However, there is substantial variation in preferences. The estimates from sd.cl.cl to sd.seas.seas are the elements of the lower triangular matrix $L$. If the user is interested in the standard errors of the variance-covariance matrix of the random parameters, $LL^\top = \Sigma$, or of the standard deviations, the S3 function vcov can be used for finding these elements. The syntax for both cases is the following:¹²

¹² To compute the standard errors, gmnl uses the deltamethod function from the msm package (Jackson 2011).

R> vcov(Elec.cor, what = 'ranp', type = 'cov', se = 'true')

Elements of the variance-covariance matrix

Estimate Std. Error z-value Pr(>|z|)

v.cl.cl 0.1536 0.0329 4.67 3.1e-06 ***

v.cl.loc 0.1928 0.0816 2.36 0.0181 *

v.cl.wk 0.2161 0.0917 2.36 0.0185 *

v.cl.tod -0.3854 0.1290 -2.99 0.0028 **

v.cl.seas -0.0576 0.0906 -0.64 0.5249

v.loc.loc 6.9630 2.2065 3.16 0.0016 **

v.loc.wk 5.2776 1.6637 3.17 0.0015 **

v.loc.tod 2.1599 1.3323 1.62 0.1050

v.loc.seas 0.1715 1.1222 0.15 0.8785

v.wk.wk 4.1440 1.3293 3.12 0.0018 **

v.wk.tod 0.7832 0.8530 0.92 0.3585

v.wk.seas -0.1441 0.7525 -0.19 0.8481

v.tod.tod 10.0058 3.4763 2.88 0.0040 **

v.tod.seas 4.0739 1.5217 2.68 0.0074 **

v.seas.seas 4.8384 1.1851 4.08 4.5e-05 ***

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

R> vcov(Elec.cor, what = 'ranp', type = 'sd', se = 'true')

Standard deviations of the random parameters

Estimate Std. Error z-value Pr(>|z|)

cl 0.392 0.042 9.33 < 2e-16 ***

loc 2.639 0.418 6.31 2.8e-10 ***

wk 2.036 0.327 6.23 4.5e-10 ***

tod 3.163 0.549 5.76 8.6e-09 ***

seas 2.200 0.269 8.17 2.2e-16 ***

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

The correlation matrix of the random parameters can be recovered using the following syntax:

R> vcov(Elec.cor, what = 'ranp', type = 'cor')

cl loc wk tod seas

cl 1.0000 0.1865 0.2708 -0.311 -0.0668

loc 0.1865 1.0000 0.9825 0.259 0.0295

wk 0.2708 0.9825 1.0000 0.122 -0.0322

tod -0.3109 0.2588 0.1216 1.000 0.5855

seas -0.0668 0.0295 -0.0322 0.586 1.0000
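The relationship between the reported sd.* coefficients and the covariance, standard deviation and correlation outputs can be checked by hand. The sketch below assumes that the sd.* estimates fill the lower triangle of L column by column; this ordering is inferred from the printed values rather than taken from the package source, and the inputs are the rounded estimates, so the last digit may differ.

```r
nm <- c("cl", "loc", "wk", "tod", "seas")

# Rounded estimates sd.cl.cl, ..., sd.seas.seas in the order shown by summary()
l_est <- c(0.3919, 0.4921, 0.5514, -0.9834, -0.1470,
           2.5925, 1.9311, 1.0198, 0.0941,
           -0.3330, 1.9341, 0.7349,
           2.0635, 1.1689,
           1.7034)

L <- matrix(0, 5, 5, dimnames = list(nm, nm))
L[lower.tri(L, diag = TRUE)] <- l_est     # fills the lower triangle by column

Sigma <- L %*% t(L)                       # covariance matrix of the random parameters
sds   <- sqrt(diag(Sigma))                # standard deviations
Rho   <- cov2cor(Sigma)                   # correlation matrix

print(round(Sigma, 4))
print(round(sds, 3))
print(round(Rho, 4))
```

Note that an individual element of L can be negative (for instance sd.wk.wk) while the implied variances on the diagonal of the covariance matrix remain positive.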

3.5. Estimating G-MNL models

In the following examples we show how to estimate G-MNL models in gmnl. Although we

do not need to specify constants in an unlabeled experiment (alternatives are fully deﬁned

by the experimental attributes), just for illustrative purposes we will assume that the ASCs

are random. Using the formula to create the ASCs produces problems in the ranp argument

due to the way the constants are labeled. So, we first create the ASCs by hand:

R> Electr$asc2 <- as.numeric(Electr$alt == 2)

R> Electr$asc3 <- as.numeric(Electr$alt == 3)

R> Electr$asc4 <- as.numeric(Electr$alt == 4)

The G-MNL model is estimated using model = "gmnl":

R> Elec.gmnl <- gmnl(choice ~ pf + cl + loc + wk + tod + seas +

+ asc2 + asc3 + asc4 | 0,

+ data = Electr,

+ subset = 1:3000,

+ model = 'gmnl',

+ R = 50,

+ panel = TRUE,

+ notscale = c(rep(0, 6), 1, 1, 1),

+ ranp = c(cl = "n", loc = "n", wk = "n",

+ tod = "n", seas = "n",

+ asc2 = "n", asc3 = "n", asc4 = "n"))

The following variables are not scaled:

[1] "asc2" "asc3" "asc4"

Estimating GMNL model

R> summary(Elec.gmnl)

Model estimated on: Wed Mar 02 10:37:15 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas + asc2 +

asc3 + asc4 | 0, data = Electr, subset = 1:3000, model = "gmnl",

ranp = c(cl = "n", loc = "n", wk = "n", tod = "n", seas = "n",

asc2 = "n", asc3 = "n", asc4 = "n"), R = 50, panel = TRUE,

notscale = c(rep(0, 6), 1, 1, 1), method = "bfgs")

Frequencies of categories:

1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:0m:22s


Coefficients:

Estimate Std. Error z-value Pr(>|z|)

pf -0.8733 0.1066 -8.19 2.2e-16 ***

cl -0.1718 0.0422 -4.07 4.7e-05 ***

loc 1.8081 0.2289 7.90 2.9e-15 ***

wk 1.7543 0.2222 7.89 2.9e-15 ***

tod -8.5960 0.9865 -8.71 < 2e-16 ***

seas -8.8653 1.0128 -8.75 < 2e-16 ***

asc2 0.3044 0.1539 1.98 0.0479 *

asc3 0.1563 0.1598 0.98 0.3279

asc4 0.1133 0.1568 0.72 0.4698

sd.cl 0.3643 0.0442 8.25 2.2e-16 ***

sd.loc 1.1015 0.2738 4.02 5.7e-05 ***

sd.wk 1.2053 0.2493 4.84 1.3e-06 ***

sd.tod 1.4655 0.2336 6.28 3.5e-10 ***

sd.seas 1.8110 0.2958 6.12 9.3e-10 ***

sd.asc2 0.5264 0.1791 2.94 0.0033 **

sd.asc3 0.0849 0.2246 0.38 0.7055

sd.asc4 0.2407 0.1881 1.28 0.2007

tau 0.6777 0.1490 4.55 5.4e-06 ***

gamma 0.3625 0.1747 2.08 0.0380 *

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -729

Number of observations: 750

Number of iterations: 98

Exit of MLE: successful convergence

Simulation based on 50 draws

Since we are including the ASCs as additional variables, the second part of the formula does

not include the ASCs (| 0). Note also that even though the ASCs are random, they are not

scaled: notscale = c(rep(0, 6), 1, 1, 1) indicates that the last three variables in the

first part of the formula (asc2, asc3, and asc4) are not scaled.

Another important issue is that gmnl estimates $\gamma$ directly by default as suggested by Keane

and Wasi (2013). However, one can estimate $\gamma^*$, where $\gamma = \exp(\gamma^*)/(1 + \exp(\gamma^*))$, as suggested

by Fiebig et al. (2010), by specifying hgamma = "indirect". Thus, hgamma = "direct" is

the default setting.
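The mapping between the two parameterizations is the logistic function, which base R provides as plogis (with inverse qlogis). The short sketch below is only an illustration of that mapping; the value 0.3625 is the rounded direct estimate of gamma reported above.

```r
# Indirect parameterization: any real gamma* maps into the interval (0, 1)
gamma_star <- c(-5, 0, 5)
gamma      <- plogis(gamma_star)          # exp(g) / (1 + exp(g))
print(round(gamma, 4))

# Value of gamma* implied by the direct estimate of gamma
gamma_hat      <- 0.3625
gamma_star_hat <- qlogis(gamma_hat)       # log(gamma / (1 - gamma))
print(gamma_star_hat)
```

Under the indirect parameterization gamma is confined to the unit interval, whereas direct estimation places no such restriction on the estimate.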

The G-MNL estimation code is also very convenient when one wants to estimate S-MNL

models with random effects (Keane and Wasi 2013). In this case, the user can fix γ and use

model = "gmnl".

R> Elec.smnl.re <- gmnl(choice ~ pf + cl + loc + wk + tod + seas +

+ asc2 + asc3 + asc4 | 0,


+ data = Electr,

+ subset = 1:3000,

+ model = 'gmnl',

+ R = 50,

+ panel = TRUE,

+ print.init = TRUE,

+ notscale = c(rep(0, 6), 1, 1, 1),

+ ranp = c(asc2 = "n", asc3 = "n", asc4 = "n"),

+ init.gamma = 0,

+ fixed = c(rep(FALSE, 16), TRUE),

+ correlation = TRUE)

The following variables are not scaled:

[1] "asc2" "asc3" "asc4"

Starting Values:

pf cl loc wk tod

-0.6018 -0.1350 1.2223 1.0387 -5.3686

seas asc2 asc3 asc4 sd.asc2.asc2

-5.5623 0.2097 0.0811 0.1065 0.1000

sd.asc2.asc3 sd.asc2.asc4 sd.asc3.asc3 sd.asc3.asc4 sd.asc4.asc4

0.1000 0.1000 0.1000 0.1000 0.1000

tau gamma

0.1000 0.0000

Estimating GMNL model

R> summary(Elec.smnl.re)

Model estimated on: Wed Mar 02 10:37:31 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas + asc2 +

asc3 + asc4 | 0, data = Electr, subset = 1:3000, model = "gmnl",

ranp = c(asc2 = "n", asc3 = "n", asc4 = "n"), R = 50, correlation = TRUE,

panel = TRUE, init.gamma = 0, notscale = c(rep(0, 6), 1,

1, 1), print.init = TRUE, fixed = c(rep(FALSE, 16), TRUE),

method = "bfgs")

Frequencies of categories:

1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:0m:16s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)


pf -0.6299 0.1148 -5.49 4.1e-08 ***

cl -0.1377 0.0328 -4.20 2.7e-05 ***

loc 1.2749 0.2200 5.80 6.8e-09 ***

wk 1.1119 0.1929 5.76 8.2e-09 ***

tod -6.2008 1.0450 -5.93 3.0e-09 ***

seas -6.3681 1.0802 -5.90 3.7e-09 ***

asc2 0.2124 0.1442 1.47 0.1409

asc3 0.2295 0.1351 1.70 0.0894 .

asc4 0.1536 0.1310 1.17 0.2410

sd.asc2.asc2 0.5694 0.2184 2.61 0.0091 **

sd.asc2.asc3 0.3066 0.1813 1.69 0.0908 .

sd.asc2.asc4 0.1508 0.1995 0.76 0.4497

sd.asc3.asc3 0.0445 0.2192 0.20 0.8390

sd.asc3.asc4 0.0893 0.2188 0.41 0.6832

sd.asc4.asc4 -0.0406 0.2034 -0.20 0.8419

tau 1.1009 0.1852 5.94 2.8e-09 ***

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -841

Number of observations: 750

Number of iterations: 76

Exit of MLE: successful convergence

Simulation based on 50 draws

The argument init.gamma indicates the initial value for γ. In this case we set it at zero.

The next step is to set the parameters that are ﬁxed by using the argument fixed, which

is passed to the maxLik function. Note that the user needs to be careful with the order of

the parameters. We encourage the user to estimate ﬁrst a model where all the parameters

are freely estimated with the argument print.init = TRUE. This argument will display the

initial values and the order used by gmnl. Generally, γ is the last parameter that enters

the likelihood specification. So, by typing fixed = c(rep(FALSE, 16), TRUE) we are only

holding γ fixed at zero, and the rest of the coefficients are freely estimated.
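Because the position of each parameter matters, one way to reduce the risk of fixing the wrong coefficient is to build the logical vector from the parameter names displayed by print.init = TRUE. The sketch below copies those names from the starting values shown above; constructing the vector by name is a suggestion for illustration, not a feature of gmnl.

```r
# Parameter names in the order displayed under "Starting Values"
par_names <- c("pf", "cl", "loc", "wk", "tod", "seas",
               "asc2", "asc3", "asc4",
               "sd.asc2.asc2", "sd.asc2.asc3", "sd.asc2.asc4",
               "sd.asc3.asc3", "sd.asc3.asc4", "sd.asc4.asc4",
               "tau", "gamma")

# TRUE marks a parameter held at its initial value; here only gamma
fixed <- par_names == "gamma"
names(fixed) <- par_names
print(which(fixed))
```

The resulting vector is identical to c(rep(FALSE, 16), TRUE) and can be passed through the fixed argument in the same way.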

By default, the initial values for the mean of the random parameters come from an MNL, and

the standard deviations or spread are set at 0.1. However, the starting values from an MNL

model may not be the best guess, since the G-MNL model is not globally concave. The best

starting values for a G-MNL model with correlated parameters might be: 1) G-MNL with

uncorrelated parameters, 2) MIXL with correlated parameters, or 3) G-MNL with correlated

parameters with γ fixed at 0. One can first get these initial parameters and then use the

start argument of gmnl to indicate the vector of appropriate starting values (see Section 3.7

for an example of how to use the start argument).

3.6. Estimating LC and MM-MNL models

The next example shows how an LC model with two classes can be estimated:


R> Elec.lc <- gmnl(choice ~ pf + cl + loc + wk + tod + seas | 0 |

+ 0 | 0 | 1,

+ data = Electr,

+ subset = 1:3000,

+ model = 'lc',

+ panel = TRUE,

+ Q = 2)

Estimating LC model

Note that for the LC model, one needs to specify at least a constant in the ﬁfth part of the

formula. If the class assignment $w_{iq}$ is also determined by socio-economic characteristics,

those covariates can also be included in the ﬁfth part. The LC model is estimated by typing

model = "lc", and the prespeciﬁed number of classes is indicated with the argument Q.

R> summary(Elec.lc)

Model estimated on: Wed Mar 02 10:37:32 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas | 0 |

0 | 0 | 1, data = Electr, subset = 1:3000, model = "lc",

Q = 2, panel = TRUE, method = "bfgs")

Frequencies of categories:

1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:0m:0s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

class.1.pf -0.4458 0.0876 -5.09 3.6e-07 ***

class.1.cl -0.1847 0.0301 -6.14 8.3e-10 ***

class.1.loc 1.2144 0.1618 7.50 6.2e-14 ***

class.1.wk 0.9641 0.1429 6.75 1.5e-11 ***

class.1.tod -3.2184 0.6880 -4.68 2.9e-06 ***

class.1.seas -3.4865 0.6929 -5.03 4.9e-07 ***

class.2.pf -0.8431 0.0968 -8.71 < 2e-16 ***

class.2.cl -0.1242 0.0453 -2.74 0.0061 **

class.2.loc 1.6445 0.2689 6.12 9.6e-10 ***

class.2.wk 1.4139 0.2120 6.67 2.6e-11 ***

class.2.tod -9.3732 0.8676 -10.80 < 2e-16 ***

class.2.seas -9.2647 0.8847 -10.47 < 2e-16 ***

(class)2 -0.2200 0.0788 -2.79 0.0052 **

---


Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -793

Number of observations: 750

Number of iterations: 77

Exit of MLE: successful convergence

Note that the underlying assumption in this example is that there are two types of cus-

tomers. For both types, the parameters have expected signs and are statistically signiﬁcant.

The probability of class assignment is assumed constant in this case, but in a dataset with sociodemographics,

assignment to classes can vary for each individual by using those sociodemographics

as explanatory variables of the class assignment probability.
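With only a constant in the class-assignment equation, the estimated coefficient (class)2 translates directly into class shares. The sketch below assumes a multinomial logit form for the class probabilities with class 1 as the reference class (its constant normalized to zero); this normalization is an assumption made for the illustration, and the value -0.2200 is the rounded estimate from the output above.

```r
# Class-assignment constants: class 1 normalized to zero (assumption),
# class 2 taken from the "(class)2" row of the LC output
gamma_class <- c(class1 = 0, class2 = -0.2200)

# Multinomial logit shares
w <- exp(gamma_class) / sum(exp(gamma_class))
print(round(w, 4))
```

Under this assumption, a negative constant for class 2 implies that slightly fewer than half of the customers are assigned to the second class.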

The following example estimates an MM-MNL with a mixture of two normal distributions:

R> Elec.mm <- gmnl(choice ~ pf + cl + loc + wk + tod + seas | 0 |

+ 0 | 0 | 1,

+ data = Electr,

+ subset = 1:3000,

+ model = 'mm',

+ R = 50,

+ panel = TRUE,

+ ranp = c(pf = "n", cl = "n", loc = "n",

+ wk = "n", tod = "n", seas = "n"),

+ Q = 2,

+ iterlim = 500)

Estimating MM-MNL model

R> summary(Elec.mm)

Model estimated on: Wed Mar 02 10:38:09 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas | 0 |

0 | 0 | 1, data = Electr, subset = 1:3000, model = "mm",

ranp = c(pf = "n", cl = "n", loc = "n", wk = "n", tod = "n",

seas = "n"), R = 50, Q = 2, panel = TRUE, iterlim = 500,

method = "bfgs")

Frequencies of categories:

1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:0m:37s


Coefficients:

Estimate Std. Error z-value Pr(>|z|)

class.1.pf -1.28036 0.12279 -10.43 < 2e-16 ***

class.1.cl -0.49715 0.08070 -6.16 7.3e-10 ***

class.1.loc 0.64445 0.24097 2.67 0.00749 **

class.1.wk 0.71241 0.21823 3.26 0.00110 **

class.1.tod -11.62474 1.02748 -11.31 < 2e-16 ***

class.1.seas -12.65698 1.13957 -11.11 < 2e-16 ***

class.2.pf -0.43575 0.11588 -3.76 0.00017 ***

class.2.cl 0.08464 0.08288 1.02 0.30713

class.2.loc 3.38611 0.38188 8.87 < 2e-16 ***

class.2.wk 2.72095 0.32688 8.32 < 2e-16 ***

class.2.tod -4.71637 1.05414 -4.47 7.7e-06 ***

class.2.seas -4.64753 0.96000 -4.84 1.3e-06 ***

class.1.sd.pf 0.10526 0.03938 2.67 0.00752 **

class.1.sd.cl 0.26906 0.05835 4.61 4.0e-06 ***

class.1.sd.loc 0.00764 0.26552 0.03 0.97704

class.1.sd.wk 0.11414 0.58053 0.20 0.84413

class.1.sd.tod 2.20533 0.45035 4.90 9.7e-07 ***

class.1.sd.seas 2.32243 0.45203 5.14 2.8e-07 ***

class.2.sd.pf 0.19696 0.03371 5.84 5.2e-09 ***

class.2.sd.cl 0.35523 0.07491 4.74 2.1e-06 ***

class.2.sd.loc 0.58717 0.28699 2.05 0.04076 *

class.2.sd.wk 1.10025 0.30323 3.63 0.00029 ***

class.2.sd.tod 1.40225 0.51875 2.70 0.00687 **

class.2.sd.seas 0.07309 0.25655 0.28 0.77572

(class)2 -0.13714 0.07844 -1.75 0.08039 .

---

Signif. codes: 0 '***'0.001 '**'0.01 '*'0.05 '.'0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -672

Number of observations: 750

Number of iterations: 125

Exit of MLE: successful convergence

Simulation based on 50 draws

The specification is similar to that of the LC model, but we now allow the parameters in each class to be normally distributed using the argument ranp. It is worth mentioning that the number of iterations required for this model is greater than that for previous models.13 For that reason we have set the maximum number of iterations at 500 using the argument iterlim. The example below adds the consideration of correlated parameters, which is the model typically used in Bayesian treatments.

13 Another important issue is weak identification. Large magnitudes of the coefficients or standard errors, or even models with slow convergence, might be a sign of weak identification, especially in complex models such as the MM-Logit model (Ruud 2007). Users should remain suspicious and further investigate such cases by analyzing the Hessian matrix of second-order partial derivatives. If the negative of the Hessian matrix of the log-likelihood is positive definite (i.e., all its eigenvalues are positive), the model is said to be locally identified (see Wedel and Kamakura 2012, p. 91). The Hessian can be obtained by typing my_model$logLik$hessian after estimating a model of class gmnl.

R> Elec.mm.c <- gmnl(choice ~ pf + cl + loc + wk + tod + seas | 0 |

+ 0|0|1,

+ data = Electr,

+ subset = 1:3000,

+ model = 'mm',

+ R = 50,

+ panel = TRUE,

+ ranp = c(pf = "n", cl = "n", loc = "n",

+ wk = "n", tod = "n", seas = "n"),

+ Q = 2,

+ iterlim = 500,

+ correlation = TRUE)

Estimating MM-MNL model

R> summary(Elec.mm.c)

Model estimated on: Wed Mar 02 10:39:42 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas | 0 |

0 | 0 | 1, data = Electr, subset = 1:3000, model = "mm",

ranp = c(pf = "n", cl = "n", loc = "n", wk = "n", tod = "n",

seas = "n"), R = 50, Q = 2, correlation = TRUE, panel = TRUE,

iterlim = 500, method = "bfgs")

Frequencies of categories:

1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:1m:33s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

class.1.pf -0.9996 0.1311 -7.63 2.4e-14 ***

class.1.cl -0.2168 0.0545 -3.98 7.0e-05 ***

class.1.loc 2.7749 0.4689 5.92 3.3e-09 ***

class.1.wk 2.3225 0.3165 7.34 2.2e-13 ***

class.1.tod -9.8537 1.1690 -8.43 < 2e-16 ***

class.1.seas -9.5464 1.0975 -8.70 < 2e-16 ***


class.2.pf -1.9182 0.4702 -4.08 4.5e-05 ***

class.2.cl -0.8815 0.1673 -5.27 1.4e-07 ***

class.2.loc 3.7753 1.0432 3.62 0.00030 ***

class.2.wk 3.0521 0.8006 3.81 0.00014 ***

class.2.tod -13.4974 3.6628 -3.68 0.00023 ***

class.2.seas -15.0237 3.6817 -4.08 4.5e-05 ***

class.1.sd.pf.pf 0.6324 0.1245 5.08 3.8e-07 ***

class.1.sd.pf.cl 0.2018 0.0522 3.87 0.00011 ***

class.1.sd.pf.loc 1.4214 0.3932 3.62 0.00030 ***

class.1.sd.pf.wk 1.0639 0.2577 4.13 3.6e-05 ***

class.1.sd.pf.tod 5.5303 1.0575 5.23 1.7e-07 ***

class.1.sd.pf.seas 4.3941 1.0112 4.35 1.4e-05 ***

class.1.sd.cl.cl 0.2116 0.0638 3.32 0.00091 ***

class.1.sd.cl.loc -1.0122 0.3917 -2.58 0.00976 **

class.1.sd.cl.wk -0.6071 0.2838 -2.14 0.03239 *

class.1.sd.cl.tod 2.0922 0.4038 5.18 2.2e-07 ***

class.1.sd.cl.seas 1.5087 0.3366 4.48 7.4e-06 ***

class.1.sd.loc.loc 1.3945 0.5937 2.35 0.01883 *

class.1.sd.loc.wk 0.8431 0.3562 2.37 0.01793 *

class.1.sd.loc.tod 0.5368 0.3483 1.54 0.12325

class.1.sd.loc.seas 0.6143 0.4411 1.39 0.16371

class.1.sd.wk.wk 0.5915 0.1638 3.61 0.00030 ***

class.1.sd.wk.tod 0.2274 0.2398 0.95 0.34301

class.1.sd.wk.seas 0.7723 0.2370 3.26 0.00112 **

class.1.sd.tod.tod 1.6883 0.5447 3.10 0.00194 **

class.1.sd.tod.seas 1.9974 0.4498 4.44 9.0e-06 ***

class.1.sd.seas.seas 0.0275 0.4979 0.06 0.95597

class.2.sd.pf.pf -1.0278 0.2735 -3.76 0.00017 ***

class.2.sd.pf.cl 0.3755 0.1017 3.69 0.00022 ***

class.2.sd.pf.loc 1.5795 0.5169 3.06 0.00224 **

class.2.sd.pf.wk 1.9678 0.5497 3.58 0.00034 ***

class.2.sd.pf.tod -7.1137 2.0080 -3.54 0.00040 ***

class.2.sd.pf.seas -6.0019 2.0065 -2.99 0.00278 **

class.2.sd.cl.cl 1.2556 0.2070 6.07 1.3e-09 ***

class.2.sd.cl.loc 0.0671 0.5375 0.12 0.90058

class.2.sd.cl.wk -0.5454 0.4364 -1.25 0.21141

class.2.sd.cl.tod -1.0686 0.5221 -2.05 0.04068 *

class.2.sd.cl.seas 4.0996 0.8738 4.69 2.7e-06 ***

class.2.sd.loc.loc 3.8107 0.9585 3.98 7.0e-05 ***

class.2.sd.loc.wk 3.3262 0.7860 4.23 2.3e-05 ***

class.2.sd.loc.tod 0.3255 0.6085 0.53 0.59279

class.2.sd.loc.seas 2.8477 0.9473 3.01 0.00265 **

class.2.sd.wk.wk -0.7910 0.3234 -2.45 0.01446 *

class.2.sd.wk.tod 0.7507 0.3089 2.43 0.01508 *

class.2.sd.wk.seas -1.6942 0.6594 -2.57 0.01019 *

class.2.sd.tod.tod -2.6490 0.6178 -4.29 1.8e-05 ***

class.2.sd.tod.seas -0.4250 0.6364 -0.67 0.50425


class.2.sd.seas.seas 1.4412 0.4171 3.46 0.00055 ***

(class)2 -1.0232 0.0983 -10.40 < 2e-16 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -640

Number of observations: 750

Number of iterations: 292

Exit of MLE: successful convergence

Simulation based on 50 draws
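The footnote's advice on weak identification can be turned into a quick numerical check. The following is a minimal sketch, not part of gmnl: `check_hessian()` is a hypothetical helper, and the two toy matrices stand in for what `my_model$logLik$hessian` would return. Because that object is the Hessian of the log-likelihood, it should be negative definite at a maximum, so the helper inspects the eigenvalues of its negative.

```r
# Hypothetical helper (not part of gmnl): inspect the curvature of the
# log-likelihood at the estimates through the eigenvalues of -H.
check_hessian <- function(H, tol = 1e-8) {
  ev <- eigen(-H, symmetric = TRUE, only.values = TRUE)$values
  list(
    eigenvalues = ev,
    locally_identified = all(ev > tol)   # TRUE when -H is positive definite
  )
}

# Toy 2 x 2 Hessians standing in for my_model$logLik$hessian
H_ok   <- matrix(c(-4, 1, 1, -3), nrow = 2)  # well-behaved maximum
H_flat <- matrix(c(-4, 2, 2, -1), nrow = 2)  # singular: one flat direction

res_ok   <- check_hessian(H_ok)
res_flat <- check_hessian(H_flat)
res_ok$locally_identified
res_flat$locally_identified
```

Eigenvalues that are positive but very close to zero would also deserve attention, since they point to directions in which the likelihood is nearly flat.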

The standard deviations of the random parameters and their standard errors for each class

can be obtained using the vcov function in the following way

R> vcov(Elec.mm.c, what = "ranp", Q = 1, type = 'sd', se = TRUE)

Standard deviations of the random parameters

Estimate Std. Error z-value Pr(>|z|)

pf 0.6324 0.1245 5.08 3.8e-07 ***

cl 0.2924 0.0514 5.69 1.2e-08 ***

loc 2.2337 0.4054 5.51 3.6e-08 ***

wk 1.6004 0.3159 5.07 4.1e-07 ***

tod 6.1767 0.9809 6.30 3.0e-10 ***

seas 5.1525 0.9761 5.28 1.3e-07 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R> vcov(Elec.mm.c, what = "ranp", Q = 2, type = 'sd', se = TRUE)

Standard deviations of the random parameters

Estimate Std. Error z-value Pr(>|z|)

pf 1.028 0.273 3.76 0.00017 ***

cl 1.311 0.209 6.26 3.8e-10 ***

loc 4.126 1.017 4.06 4.9e-05 ***

wk 3.982 0.885 4.50 6.8e-06 ***

tod 7.709 1.960 3.93 8.4e-05 ***

seas 8.128 1.853 4.39 1.2e-05 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
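To see where these standard deviations come from, the class-1 values can be recomputed by hand from the summary output. The sketch below assumes that the `class.1.sd.*` entries are the elements of a lower-triangular Cholesky factor L, listed column by column, so that the covariance matrix of the random parameters is LL'. The numbers are copied from the summary above; the object names are ours.

```r
# Class-1 Cholesky elements copied from summary(Elec.mm.c), in printed order
# (pf.pf, pf.cl, ..., seas.seas), i.e., column by column of the lower triangle.
vars <- c("pf", "cl", "loc", "wk", "tod", "seas")
chol_elems <- c(
  0.6324,  0.2018,  1.4214, 1.0639, 5.5303, 4.3941,  # column pf
  0.2116, -1.0122, -0.6071, 2.0922, 1.5087,          # column cl
  1.3945,  0.8431,  0.5368, 0.6143,                  # column loc
  0.5915,  0.2274,  0.7723,                          # column wk
  1.6883,  1.9974,                                   # column tod
  0.0275                                             # column seas
)

L <- matrix(0, 6, 6, dimnames = list(vars, vars))
L[lower.tri(L, diag = TRUE)] <- chol_elems  # filled in column-major order

Sigma <- L %*% t(L)        # implied covariance matrix of the parameters
sds   <- sqrt(diag(Sigma)) # implied standard deviations
round(sds, 4)
```

Up to rounding of the inputs, the result agrees with the estimates printed by vcov for Q = 1. For example, the standard deviation of cl is the square root of 0.2018^2 + 0.2116^2.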

3.7. Willingness-to-pay space

Willingness-to-pay space models reparameterize the parameter space in such a way that the marginal WTP for each attribute (and, in a random parameter model, the parameters of its heterogeneity distribution) is estimated directly, rather than the marginal utility (preference parameters). To motivate the WTP-space model, consider the following latent utility

$$U_{ijt} = -\alpha p_{ijt} + x_{ijt}^\top \beta + \epsilon_{ijt}, \tag{5}$$

where $p_{ijt}$ is the price and $-\alpha$ is the price coefficient. This model is known as the model in preference space. The utility in WTP-space is obtained by dividing the attributes' coefficients by the price coefficient in the following way

$$\begin{aligned} U_{ijt} &= -\alpha p_{ijt} + x_{ijt}^\top \left(-\alpha \, \frac{\beta}{-\alpha}\right) + \epsilon_{ijt} \\ &= -\alpha p_{ijt} + x_{ijt}^\top (-\alpha\gamma) + \epsilon_{ijt}, \end{aligned}$$

where $\gamma = \beta/(-\alpha)$ is the WTP parameter vector, and $\alpha$ is fixed and equal to 1. Although the preference-space and WTP-space models are behaviorally equivalent, the latter approach is useful when allowing for random heterogeneity in $\gamma$.

In effect, the WTP-space approach is very appealing because it allows the analyst to specify and estimate the distributions of WTP directly, rather than deriving them indirectly from distributions of coefficients in the preference space model (Scarpa, Thiene, and Train 2008). In the preference space model, the distribution of WTP is derived from the distribution of the ratio of β and α. However, this ratio may not result in a well-specified distribution. For example, if both α and β are normally distributed, then the ratio follows a heavy-tailed, Cauchy-type distribution with no finite moments (Daly, Hess, and Train 2011). Motivated by this problem, Train and Weeks (2005) and Sonnier, Ainslie, and Otter (2007) extended the WTP-space approach by allowing γ to follow any distribution, thus avoiding the problem of non-finite moments for the distribution of WTP.
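The moments problem is easy to see in a small simulation. The sketch below is purely illustrative: the means and standard deviations are hypothetical values chosen by us, not estimates from the Electricity data. It compares a WTP obtained as a ratio of two normal draws with a WTP that is drawn directly from a normal distribution, as a WTP-space model would specify.

```r
set.seed(123)
n <- 1e5

# Hypothetical preference-space parameters (illustrative values only)
alpha <- rnorm(n, mean = 1, sd = 0.5)  # price coefficient
beta  <- rnorm(n, mean = 2, sd = 1)    # attribute coefficient

wtp_ratio  <- beta / alpha             # WTP implied by preference space
wtp_direct <- rnorm(n, mean = 2, sd = 1)  # WTP specified directly

# Central quantiles are comparable, but the ratio has far heavier tails
quantile(wtp_ratio,  c(0.001, 0.5, 0.999))
quantile(wtp_direct, c(0.001, 0.5, 0.999))
range(wtp_ratio)
range(wtp_direct)
```

Draws of alpha close to zero produce arbitrarily large ratios, which is why sample means and variances of `wtp_ratio` do not settle down as `n` grows.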

To illustrate the concept of WTP-space, and how it can be estimated using gmnl, we will first show the case without random parameters. The standard procedure to derive willingness-to-pay measures is to start with a model in preference space, and then make inference on the appropriate ratio that represents the marginal rate of substitution between a given attribute and price. For example, consider the simple conditional logit model,

R> clogit <- gmnl(choice ~ pf + cl + loc + wk + tod + seas | 0,

+ data = Electr,

+ subset = 1:3000)

R> summary(clogit)

Model estimated on: Wed Mar 02 10:39:43 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas | 0,

data = Electr, subset = 1:3000, method = "nr")

Frequencies of categories:


1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:0m:0s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

pf -0.6113 0.0548 -11.15 < 2e-16 ***

cl -0.1398 0.0204 -6.85 7.2e-12 ***

loc 1.1986 0.1197 10.01 < 2e-16 ***

wk 1.0304 0.1063 9.69 < 2e-16 ***

tod -5.4540 0.4341 -12.56 < 2e-16 ***

seas -5.6648 0.4419 -12.82 < 2e-16 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Optimization of log-likelihood by Newton-Raphson maximisation

Log Likelihood: -870

Number of observations: 750

Number of iterations: 4

Exit of MLE: gradient close to zero

To estimate the willingness to pay for each attribute, one needs to divide each attribute

parameter by that of price pf. This ratio can be easily retrieved using the function wtp.gmnl:

R> wtp.gmnl(clogit, wrt = "pf")

Willigness-to-pay respect to: pf

Estimate Std. Error t-value Pr(>|t|)

cl 0.2287 0.0358 6.38 1.8e-10 ***

loc -1.9610 0.2304 -8.51 < 2e-16 ***

wk -1.6858 0.1949 -8.65 < 2e-16 ***

tod 8.9226 0.2025 44.07 < 2e-16 ***

seas 9.2675 0.2164 42.83 < 2e-16 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The argument wrt = "pf" indicates that all the parameters should be divided by the parameter of the attribute pf.14 Using the estimated ratios, we can say, for example, that an individual with average price and contract length (cl) is willing to pay ≈ 0.23, or roughly a quarter of a cent per kWh extra, to have a contract that is one year shorter.

14 In the current version of gmnl, the standard errors of the WTP estimates are calculated using the delta method, which only works well when there are no problems of weak identification in the ratio.
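The ratios reported by wtp.gmnl can be checked by hand from the conditional logit summary, and the delta method behind their standard errors can be sketched in a few lines. In the sketch below the estimates are copied from summary(clogit); `delta_se_ratio()` is a hypothetical helper written for this illustration. The covariance between the two estimates is not printed in the summary, so it is set to zero here, which means the resulting standard error is only indicative and will generally differ from the one reported by wtp.gmnl.

```r
# Point estimates and standard errors copied from summary(clogit)
b  <- c(pf = -0.6113, cl = -0.1398, loc = 1.1986,
        wk = 1.0304, tod = -5.4540, seas = -5.6648)
se <- c(pf = 0.0548, cl = 0.0204, loc = 0.1197,
        wk = 0.1063, tod = 0.4341, seas = 0.4419)

# WTP as the ratio of each attribute coefficient to the price coefficient
wtp <- b[-1] / b["pf"]
round(wtp, 4)

# Delta method for g(b_k, b_p) = b_k / b_p, with gradient (1/b_p, -b_k/b_p^2).
# ASSUMPTION: the covariance between the estimates is unknown here and is
# set to zero by default; pass cov_kp to use the true value from vcov().
delta_se_ratio <- function(bk, bp, sek, sep, cov_kp = 0) {
  grad <- c(1 / bp, -bk / bp^2)
  V <- matrix(c(sek^2, cov_kp, cov_kp, sep^2), nrow = 2)
  sqrt(drop(t(grad) %*% V %*% grad))
}

se_cl_nocov <- delta_se_ratio(b["cl"], b["pf"], se["cl"], se["pf"])
se_cl_nocov
```

With a fitted model at hand, the full covariance matrix from vcov(clogit) would replace the zero-covariance assumption.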


Another way to estimate the same WTP coefficients is to use the S-MNL model to derive a specification in WTP-space. To do so, we first need to compute the negative of the price attribute using the mlogit.data function:

R> ElectrO <- mlogit.data(Electricity, id = "id", choice = "choice",

+ varying = 3:26, shape = "wide", sep = "",

+ opposite = c("pf"))

Next, we need to set the values for the price parameter and τ at 1 and 0, respectively. The fixed argument is used to set these values.

R> start <- c(1, 0, 0, 0, 0, 0, 0, 0)

R> wtps <- gmnl(choice ~ pf + cl + loc + wk + tod + seas | 0 |

+ 0|0|1,

+ data = ElectrO,

+ model = "smnl",

+ subset = 1:3000,

+ R = 1,

+ fixed = c(TRUE, FALSE, FALSE, FALSE, FALSE,

+ FALSE, TRUE, FALSE),

+ panel = TRUE,

+ start = start,

+ method = "bhhh",

+ iterlim = 500)

Estimating SMNL model

Note that we fitted the S-MNL model with a constant in the scale. This constant, after a proper transformation, will represent the price parameter. Since we are working with a fixed parameter model, the number of draws is set equal to 1.

R> summary(wtps)

Model estimated on: Wed Mar 02 10:39:43 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas | 0 |

0 | 0 | 1, data = ElectrO, subset = 1:3000, model = "smnl",

start = start, R = 1, panel = TRUE, fixed = c(TRUE, FALSE,

FALSE, FALSE, FALSE, FALSE, TRUE, FALSE), method = "bhhh",

iterlim = 500)

Frequencies of categories:

1 2 3 4

0.215 0.303 0.217 0.265


The estimation took: 0h:0m:1s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

cl -0.2287 0.0361 -6.34 2.4e-10 ***

loc 1.9609 0.2284 8.59 < 2e-16 ***

wk 1.6857 0.1915 8.80 < 2e-16 ***

tod -8.9226 0.2025 -44.06 < 2e-16 ***

seas -9.2675 0.2166 -42.79 < 2e-16 ***

het.(Intercept) -0.4922 0.0917 -5.37 7.9e-08 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Optimization of log-likelihood by BHHH maximisation

Log Likelihood: -870

Number of observations: 750

Number of iterations: 14

Exit of MLE: successive function values within tolerance limit

Simulation based on 1 draws

Each value in the output represents the WTP estimate for the respective attribute. Note that these WTP estimates are the same in magnitude as those obtained using the wtp.gmnl function; the signs are reversed because the price attribute now enters the model with its sign flipped. The price coefficient can be obtained using the following transformation:

R> -exp(coef(wtps)["het.(Intercept)"])

het.(Intercept)

-0.611

If one requires the standard error for the price coefficient, the deltamethod function from the msm (Jackson 2011) package can be used in the following way:

R> library("msm")

R> estmean <- coef(wtps)

R> estvar <- vcov(wtps)

R> se <- deltamethod(~ -exp(x6), estmean, estvar, ses = TRUE)

R> se

[1] 0.056
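For a single-parameter transformation such as this one, the delta method reduces to one line, so the result can be verified without msm. The sketch below uses only the two numbers printed for het.(Intercept) in summary(wtps); the variable names are ours.

```r
# Values copied from the het.(Intercept) row of summary(wtps)
het_int    <- -0.4922
se_het_int <-  0.0917

# Transformation used in the text: price coefficient = -exp(het.(Intercept))
price_coef <- -exp(het_int)

# Delta method: |d/dx (-exp(x))| = exp(x), so SE = exp(x) * SE(x)
se_price <- exp(het_int) * se_het_int

round(c(price = price_coef, se = se_price), 3)
```

Both values agree, to three decimals, with the output shown above, and they are close to the estimate and standard error of pf in the conditional logit model.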

Using the same idea, one can let the WTP vary across individuals. To do so, we can estimate a G-MNL model where the price parameter is fixed at 1, as in the previous example, and the G-MNL parameter γ is fixed at 0:

R> start2 <- c(1, coef(wtps), rep(0.1, 5), 0.1, 0)

R> wtps2 <- gmnl(choice ~ pf + cl + loc + wk + tod + seas | 0 | 0 | 0 | 1,


+ data = ElectrO,

+ subset = 1:3000,

+ model = "gmnl",

+ R = 50,

+ fixed = c(TRUE, rep(FALSE, 12), TRUE),

+ panel = TRUE,

+ start = start2,

+ ranp = c(cl = "n", loc = "n", wk = "n", tod = "n", seas = "n"))

Estimating GMNL model

R> summary(wtps2)

Model estimated on: Wed Mar 02 10:40:10 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas | 0 |

0 | 0 | 1, data = ElectrO, subset = 1:3000, model = "gmnl",

start = start2, ranp = c(cl = "n", loc = "n", wk = "n", tod = "n",

seas = "n"), R = 50, panel = TRUE, fixed = c(TRUE, rep(FALSE,

12), TRUE), method = "bfgs")

Frequencies of categories:

1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:0m:26s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

cl -0.2727 0.0518 -5.26 1.4e-07 ***

loc 2.1631 0.2452 8.82 < 2e-16 ***

wk 1.9424 0.1935 10.04 < 2e-16 ***

tod -9.6782 0.2933 -32.99 < 2e-16 ***

seas -9.8866 0.2772 -35.67 < 2e-16 ***

het.(Intercept) 0.1142 0.1401 0.82 0.41

sd.cl 0.4115 0.0520 7.91 2.7e-15 ***

sd.loc 1.7850 0.2515 7.10 1.3e-12 ***

sd.wk 1.2865 0.2213 5.81 6.1e-09 ***

sd.tod 1.7174 0.2478 6.93 4.2e-12 ***

sd.seas 2.2146 0.3723 5.95 2.7e-09 ***

tau 0.6904 0.1394 4.95 7.3e-07 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization


Log Likelihood: -737

Number of observations: 750

Number of iterations: 143

Exit of MLE: successful convergence

Simulation based on 50 draws

Note that the model recast in WTP-space that is implemented in the G-MNL specification above allows the researcher to specify directly the heterogeneity distribution of WTP measures (Sonnier et al. 2007). When working in preference space, and then deriving WTP as a parameter ratio, normally distributed WTP measures can only be derived if the price parameter is fixed (i.e., no unobserved heterogeneity in the marginal utility of income, which is a strong assumption) and the parameters of the remaining attributes are assumed to be normally distributed. The problem with all normals in preference space is that the ratio of two normally distributed parameters has a distribution with very long tails and without moments, which leads to unexpected individual-level predictions.

Finally, a WTP-space model with correlated random parameters can be estimated in the

following way:

R> n_ran <- 5

R> start3 <- c(1, coef(wtps), rep(0.1, .5 * n_ran * (n_ran + 1)), 0.1, 0)

R> wtps3 <- gmnl(choice ~ pf + cl + loc + wk + tod + seas | 0 | 0 | 0 | 1,

+ data = ElectrO,

+ subset = 1:3000,

+ model = "gmnl",

+ R = 50,

+ fixed = c(TRUE, rep(FALSE, 22), TRUE),

+ panel = TRUE,

+ start = start3,

+ ranp = c(cl = "n", loc = "n", wk = "n", tod = "n", seas = "n"),

+ correlation = TRUE)

Estimating GMNL model

R> summary(wtps3)

Model estimated on: Wed Mar 02 10:40:50 2016

Call:

gmnl(formula = choice ~ pf + cl + loc + wk + tod + seas | 0 |

0 | 0 | 1, data = ElectrO, subset = 1:3000, model = "gmnl",

start = start3, ranp = c(cl = "n", loc = "n", wk = "n", tod = "n",

seas = "n"), R = 50, correlation = TRUE, panel = TRUE,

fixed = c(TRUE, rep(FALSE, 22), TRUE), method = "bfgs")

Frequencies of categories:


1 2 3 4

0.215 0.303 0.217 0.265

The estimation took: 0h:0m:40s

Coefficients:

Estimate Std. Error z-value Pr(>|z|)

cl -0.1968 0.0588 -3.35 0.00082 ***

loc 2.1682 0.3228 6.72 1.9e-11 ***

wk 1.8273 0.2523 7.24 4.4e-13 ***

tod -9.9662 0.3736 -26.68 < 2e-16 ***

seas -9.9928 0.3535 -28.27 < 2e-16 ***

het.(Intercept) -0.1385 0.0945 -1.47 0.14268

sd.cl.cl 0.4741 0.0625 7.59 3.2e-14 ***

sd.cl.loc 0.3903 0.2560 1.52 0.12730

sd.cl.wk 0.6118 0.2444 2.50 0.01231 *

sd.cl.tod -1.3139 0.2727 -4.82 1.5e-06 ***

sd.cl.seas -0.2438 0.2655 -0.92 0.35840

sd.loc.loc 2.5819 0.4937 5.23 1.7e-07 ***

sd.loc.wk 1.6391 0.4104 3.99 6.5e-05 ***

sd.loc.tod 1.9078 0.4725 4.04 5.4e-05 ***

sd.loc.seas 0.9679 0.3765 2.57 0.01014 *

sd.wk.wk -0.7259 0.2745 -2.64 0.00818 **

sd.wk.tod 2.2373 0.3745 5.97 2.3e-09 ***

sd.wk.seas 1.5985 0.3521 4.54 5.6e-06 ***

sd.tod.tod 2.5361 0.4169 6.08 1.2e-09 ***

sd.tod.seas 1.7418 0.3636 4.79 1.7e-06 ***

sd.seas.seas 2.0034 0.2872 6.98 3.0e-12 ***

tau -0.2557 0.1219 -2.10 0.03595 *

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Optimization of log-likelihood by BFGS maximization

Log Likelihood: -691

Number of observations: 750

Number of iterations: 174

Exit of MLE: successful convergence

Simulation based on 50 draws

Note that n_ran is the number of random coefficients, which is used to compute the number of initial values for the L matrix in the start3 vector of initial parameters: let this number be $K_a$, then the number of elements is equal to $(1/2) \cdot K_a \cdot (K_a + 1)$.
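A quick count confirms that the start and fixed vectors used for wtps3 have matching lengths. The sketch below only does the arithmetic; the value 6 for the length of coef(wtps) is read off the S-MNL summary above (five WTP parameters plus het.(Intercept)), and the variable names are ours.

```r
n_ran  <- 5                        # random WTP parameters
n_chol <- n_ran * (n_ran + 1) / 2  # lower-triangular elements of L
n_wtps <- 6                        # length(coef(wtps)): 5 WTPs + het.(Intercept)

# price (fixed at 1) + coef(wtps) + elements of L + tau + gamma (fixed at 0)
n_start <- 1 + n_wtps + n_chol + 2
n_fixed <- length(c(TRUE, rep(FALSE, 22), TRUE))

c(n_chol = n_chol, n_start = n_start, n_fixed = n_fixed)
```

The same count for the uncorrelated model wtps2 replaces the 15 elements of L with 5 standard deviations, which gives the 14 entries of its fixed vector.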

3.8. Individual parameters

Similarly to the Rchoice package (Sarrias 2015), gmnl also allows the analyst to get the conditional estimates for each individual in the sample (see for example Train 2009; Greene 2012). Using Bayes' theorem we obtain

$$f(\beta_i \mid y_i, X_i, \theta) = \frac{f(y_i \mid X_i, \beta_i)\, g(\beta_i \mid \theta)}{\int_{\beta_i} f(y_i \mid X_i, \beta_i)\, g(\beta_i \mid \theta)\, d\beta_i},$$

where $f(\beta_i \mid y_i, X_i, \theta)$ is the distribution of the individual parameters $\beta_i$ conditional on the observed sequence of choices, and $g(\beta_i \mid \theta)$ is the unconditional distribution. The conditional expectation of $\beta_i$ is thus given by:

$$E[\beta_i \mid y_i, X_i, \theta] = \frac{\int_{\beta_i} \beta_i\, f(y_i \mid X_i, \beta_i)\, g(\beta_i \mid \theta)\, d\beta_i}{\int_{\beta_i} f(y_i \mid X_i, \beta_i)\, g(\beta_i \mid \theta)\, d\beta_i}. \tag{6}$$

The expectation in Equation 6 gives us the conditional mean of the distribution of the random parameters, which can also be interpreted as the mean of the posterior distribution of the individual parameters. Simulators for this conditional expectation are presented below for the continuous, discrete and mixture cases, respectively:

$$\hat{\bar{\beta}}_i = \hat{E}[\beta_i \mid y_i, X_i, \theta] = \frac{\frac{1}{R}\sum_{r=1}^{R} \hat{\beta}_{ir} \prod_t f(y_{it} \mid x_{it}, \hat{\beta}_{ir}, \hat{\theta})}{\frac{1}{R}\sum_{r=1}^{R} \prod_t f(y_{it} \mid x_{it}, \hat{\beta}_{ir}, \hat{\theta})}$$

$$\hat{\bar{\beta}}_i = \hat{E}[\beta_i \mid y_i, X_i, \theta_q] = \frac{\sum_{q=1}^{Q} \hat{\beta}_{q}\, \hat{w}_{iq} \prod_t f(y_{it} \mid x_{it}, \hat{\beta}_{q}, \hat{\theta}_q)}{\sum_{q=1}^{Q} \hat{w}_{iq} \prod_t f(y_{it} \mid x_{it}, \hat{\beta}_{q}, \hat{\theta}_q)}$$

$$\hat{\bar{\beta}}_i = \hat{E}[\beta_i \mid y_i, X_i, \theta_q] = \frac{\sum_{q=1}^{Q} \hat{w}_{iq} \frac{1}{R}\sum_{r=1}^{R} \hat{\beta}_{iqr} \prod_t f(y_{it} \mid x_{it}, \hat{\beta}_{iqr}, \hat{\theta}_q)}{\sum_{q=1}^{Q} \hat{w}_{iq} \frac{1}{R}\sum_{r=1}^{R} \prod_t f(y_{it} \mid x_{it}, \hat{\beta}_{iqr}, \hat{\theta}_q)}$$

In order to construct the confidence interval for $\hat{\bar{\beta}}_i$, we can derive an estimator of the conditional variance from the point estimates as follows (Greene 2012, chap. 15):

$$V_i = \hat{E}\left[\beta_i^2 \mid y_i, X_i, \theta\right] - \hat{E}\left[\beta_i \mid y_i, X_i, \theta\right]^2. \tag{7}$$

An approximate normal-based 95% confidence interval can then be constructed as $\hat{\bar{\beta}}_i \pm 1.96 \times V_i^{1/2}$. The gmnl package uses these formulae to compute the individual parameters along with their 95% confidence interval. However, it is worth mentioning that there are two shortcomings with the procedure described above for computing the conditional variance of $\beta_i$. First, the estimator in Equation 7 is an estimator of the variance of the conditional distribution of $\beta_i$, and not an estimator of the sampling variance of the estimator of the expected value. The estimated conditional variance will approach the estimated variance in the population as the number of choice situations faced by each person increases without bound (Hensher, Greene, and Rose 2006). Second, it does not take into account the sampling variability of the parameter estimates.15
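The simulator for the continuous case, together with the variance in Equation 7, can be sketched in base R for a single individual. This is a toy illustration under our own assumptions, not gmnl's internal code: there is one random coefficient, a binary logit kernel, and hypothetical data and population parameters (`b_hat`, `s_hat`, `x`, and `y` are all made up for the example).

```r
set.seed(42)
R  <- 2000                 # number of draws
Tn <- 8                    # choice situations faced by the individual
b_hat <- 1                 # hypothetical population mean of beta
s_hat <- 1.5               # hypothetical population sd of beta

x <- rnorm(Tn)                  # attribute difference between two alternatives
y <- c(1, 1, 0, 1, 1, 1, 0, 1)  # hypothetical observed choices (1 = alt. A)

beta_r <- b_hat + s_hat * rnorm(R)  # draws from g(beta | theta)

# Likelihood of the observed sequence at each draw: prod_t f(y_t | x_t, beta_r)
seq_lik <- sapply(beta_r, function(br) {
  p <- plogis(br * x)                # binary logit probability of choosing A
  prod(ifelse(y == 1, p, 1 - p))
})

w <- seq_lik / sum(seq_lik)          # normalized weights (the 1/R cancels)
cond_mean <- sum(w * beta_r)         # simulated conditional mean
cond_var  <- sum(w * beta_r^2) - cond_mean^2  # Equation 7
ci <- cond_mean + c(-1, 1) * 1.96 * sqrt(cond_var)

c(mean = cond_mean, sd = sqrt(cond_var), lower = ci[1], upper = ci[2])
```

The weights make the role of the data explicit: draws that explain the observed sequence of choices well receive more weight, so the conditional mean is pulled away from the population mean `b_hat` toward values supported by the individual's choices.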

As an illustration, we can plot the kernel density of the individuals' conditional mean for the loc parameter using the Elec.cor model by typing the following:

15 Under the Bayesian framework, the estimation of the individual-level estimates fully accounts for uncertainty in the population-level parameters in the estimation routine (see for example Daziano and Achtnicht 2014).

R> plot(Elec.cor, par = "loc", effect = "ce", type = "density",

+ col = "grey")

Figure 1 displays the distribution of the individuals' conditional mean for the parameter of loc. The gray area gives us the proportion of individuals with a positive conditional mean.

Figure 1: Kernel density of the individuals' conditional mean. (Plot title: Conditional Distribution for loc; vertical axis: Density.)

The 95% confidence interval of the conditional mean for the first 30 individuals is shown in Figure 2, which was plotted using the following syntax:16

R> plot(Elec.cor, par = "loc", effect = "ce", ind = TRUE, id = 1:30)

Figure 2: 95% confidence interval for the conditional means. (Plot title: 95% Probability Intervals for loc; horizontal axis: Individuals.)

16 gmnl uses the plotrix package (Lemon 2006) to create the confidence interval graph.

Another important function in gmnl is effect.gmnl. This function allows users to get the individuals' conditional mean of both the preference parameters and the willingness-to-pay measures.

For example, one can get the individual conditional mean and standard errors plotted in Figure 2 by typing:

R> bi.loc <- effect.gmnl(Elec.cor, par = "loc", effect = "ce")

R> summary(bi.loc$mean)

Min. 1st Qu. Median Mean 3rd Qu. Max.

-0.79 0.42 2.04 2.12 3.46 7.13

R> summary(bi.loc$sd.est)

Min. 1st Qu. Median Mean 3rd Qu. Max.

0.113 0.564 0.795 0.866 1.130 1.860

The conditional mean of the willingness to pay for "loc" ($\text{wtp} = \beta_{i,\text{loc}} / \beta_{\text{pf}}$) for all individuals in the sample can be obtained using:

R> wtp.loc <- effect.gmnl(Elec.cor, par = "loc", effect = "wtp", wrt = "pf")

Note that the argument par is the variable whose parameter goes in the numerator, and the

argument wrt is a string indicating which parameter goes in the denominator.


R> summary(wtp.loc$mean)

Min. 1st Qu. Median Mean 3rd Qu. Max.

-8.19 -3.98 -2.35 -2.44 -0.48 0.91

R> summary(wtp.loc$sd.est)

Min. 1st Qu. Median Mean 3rd Qu. Max.

0.130 0.648 0.914 0.996 1.300 2.130

4. Computational issues

There are some issues about computation and convergence of maximum likelihood worth mentioning before concluding this paper. Regarding models estimated by maximum simulated likelihood, there are at least four factors influencing the estimation of the parameters. First, if the draws used in the estimation are pseudo-random draws, instead of Halton draws, then the parameters might change if the seed is changed. Second, the number of draws used in the simulations is very important in order to have a good approximation of the likelihood. In this paper, we used just a few draws due to time restrictions. Nevertheless, in applied work researchers must use a greater number of draws, especially if pseudo-random draws are used. The 'standard rule' is to increase the number of draws in each run until the estimates stabilize. For a comprehensive review of the impact of the number of draws see for example Bhat (2001) and Sándor and Train (2004).
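The difference between the two kinds of draws can be illustrated with a few lines of base R. The `halton()` function below is a hypothetical helper written for this sketch (a plain radical-inverse construction); it is not the generator used internally by gmnl, which may differ in details such as discarding initial elements.

```r
# Minimal Halton sequence via the radical inverse (illustrative helper,
# not gmnl's internal generator).
halton <- function(n, base = 2) {
  vapply(seq_len(n), function(i) {
    f <- 1
    h <- 0
    while (i > 0) {
      f <- f / base
      h <- h + f * (i %% base)
      i <- i %/% base
    }
    h
  }, numeric(1))
}

h2 <- halton(8, base = 2)           # deterministic points in (0, 1)
z  <- qnorm(halton(100, base = 3))  # standard normal draws via inverse CDF

# Pseudo-random draws change with the seed; Halton draws do not
set.seed(1); u1 <- runif(8)
set.seed(2); u2 <- runif(8)
h2
identical(u1, u2)
```

Because the Halton points are deterministic, re-estimating a model with the same number of draws reproduces the same simulated likelihood, whereas pseudo-random draws require fixing the seed to obtain reproducible results.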

Another important factor affecting convergence and estimates is the starting values. It is important to stress that the likelihood functions of the models reviewed in this article are complex and not globally concave. Thus, a poor choice of initial values might lead to a local maximum instead of the global maximum, or to getting stuck in a flat region of the log-likelihood due to numerical overflow in the Hessian. If this is the case, the user of gmnl will receive the following message: Warning message: In sqrt(diag(vcov(object))): NaNs produced. We encourage users to try different initial values using the argument start.

Finally, the algorithm used for optimization of the MSL is another important factor that users should consider. gmnl uses the function maxLik to maximize the log-likelihood function, which implements the Newton-Raphson (NR), BFGS and Berndt-Hall-Hall-Hausman (BHHH) procedures. By default, all models using simulation are estimated using the BFGS algorithm. If the estimation does not converge, users should try a different algorithm. As a cautionary note, gmnl uses the numerical Hessian if the NR algorithm is used. Thus, it can be very slow compared to the other methods. BHHH is generally faster, but it might fail if the variables have very different scales. The larger the ratio between the largest standard deviation and the smallest standard deviation of the variables, the more problems the user will have with the estimation procedure. Therefore, users should check the variables and re-scale them if necessary, and always look at the output message regarding convergence. It is good practice to use the argument print.level = 2 to trace the optimization procedure in real time. For more information about the arguments for optimization, type help(maxLik).
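The scaling check mentioned above is straightforward to carry out before estimation. The sketch below uses a hypothetical data frame with two attributes on very different scales; `sd_ratio()` is our own helper, and the choice of dividing income by 10,000 is only an example of a convenient change of units.

```r
# Hypothetical attribute data with very different scales
set.seed(10)
X <- data.frame(
  price  = rnorm(200, mean = 5,     sd = 2),
  income = rnorm(200, mean = 50000, sd = 15000)
)

# Ratio between the largest and the smallest standard deviation
sd_ratio <- function(d) {
  s <- sapply(d, sd)
  max(s) / min(s)
}

ratio_before <- sd_ratio(X)   # very large: a warning sign for BHHH

X$income <- X$income / 10000  # re-scale: income in units of 10,000
ratio_after <- sd_ratio(X)    # now of order one

c(before = ratio_before, after = ratio_after)
```

Re-scaling changes only the units of the corresponding coefficient, so estimates and WTP ratios can be converted back after estimation.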

5. Conclusions


The package gmnl implements the maximum likelihood estimator of random parameter logit models with heterogeneity distributions that can be continuous, discrete, or discrete-continuous mixtures. In this paper we have shown how gmnl can fit several extensions to the standard multinomial logit model, including the recently derived mixed-mixed multinomial logit (MM-MNL). To our knowledge there is no other widely available statistical package that has implemented the maximum simulated likelihood estimator of MM-MNL, and we want to highlight that gmnl makes use of analytical expressions of the gradient. gmnl is also the first implementation in R of the estimator of the scale heterogeneity multinomial logit (S-MNL), the generalized multinomial logit (G-MNL), and the latent class logit (LC). Whereas there are other packages in R for the estimation of MIXL, gmnl allows for the inclusion of individual-specific variables to explain the mean of the random parameters for a mixture of deterministic taste variations and unobserved preference heterogeneity. In addition, gmnl also implements Johnson $S_b$ heterogeneity distributions.

Another key post-estimation functionality of gmnl that we have illustrated in this paper is the derivation of conditional point and interval estimates of either the random parameters or willingness-to-pay measures at the individual level. Random parameter models can be used to make inference on the preference parameters of each individual in the sample, but most packages that estimate MIXL models lack a command to produce individual-level estimates. gmnl is able to compute individual parameters for all generalized logit models that are implemented in the package, including G-MNL, MIXL, and LC.

Additional functionalities that we expect to incorporate in the future are the consideration of different choice sets for each individual and the implementation of different methods for the construction of confidence intervals of willingness-to-pay measures.

Acknowledgments

We would like to express our gratitude to the two anonymous referees and the editor whose

comments greatly improved this paper and the package. We are also very grateful to all the

users of this package who have helped us to improve it through their questions and suggestions.

References

Bhat CR (2001). "Quasi-Random Maximum Simulated Likelihood Estimation of the Mixed Multinomial Logit Model." Transportation Research Part B: Methodological, 35(7), 677–693. doi:10.1016/S0191-2615(00)00014-X.

Boxall PC, Adamowicz WL (2002). "Understanding Heterogeneous Preferences in Random Utility Models: A Latent Class Approach." Environmental and Resource Economics, 23(4), 421–446. doi:10.1023/A:1021351721619.

Bujosa A, Riera A, Hicks RL (2010). "Combining Discrete and Continuous Representations of Preference Heterogeneity: A Latent Class Approach." Environmental and Resource Economics, 47(4), 477–493. doi:10.1007/s10640-010-9389-y.

Croissant Y (2012). "Estimation of Multinomial Logit Models in R: The mlogit Packages." R package version 0.2-2. URL http://cran.r-project.org/web/packages/mlogit/vignettes/mlogit.pdf.

Daly A, Hess S, Train K (2011). "Assuring Finite Moments for Willingness to Pay in Random Coefficient Models." Transportation, 39(1), 19–31. ISSN 1572-9435. doi:10.1007/s11116-011-9331-3. URL http://dx.doi.org/10.1007/s11116-011-9331-3.

Daziano RA, Achtnicht M (2014). "Accounting for Uncertainty in Willingness to Pay for Environmental Benefits." Energy Economics, 44, 166–177. ISSN 0140-9883. doi:10.1016/j.eneco.2014.03.023. URL http://www.sciencedirect.com/science/article/pii/S014098831400070X.

Dotson J, Brazell J, Howell J, Lenk P, Otter T, MacEachern S, Allenby G (2015). "A Probit Model with Structured Covariance for Similarity Effects and Source of Volume Calculations." Available at SSRN: http://ssrn.com/abstract=1396232.

Dumont J, Keller J, Carpenter C (2014). RSGHB: Functions for Hierarchical Bayesian Estimation: A Flexible Approach. R package version 1.0.2, URL http://CRAN.R-project.org/package=RSGHB.

Fiebig DG, Keane MP, Louviere J, Wasi N (2010). "The Generalized Multinomial Logit Model: Accounting for Scale and Coefficient Heterogeneity." Marketing Science, 29(3), 393–421. doi:10.1287/mksc.1090.0508.

Gourieroux C, Monfort A (1997). Simulation-based Econometric Methods. Oxford University Press.

Greene WH (2012). Econometric Analysis. 7th edition. Prentice Hall.

Greene WH, Hensher DA (2003). "A Latent Class Model for Discrete Choice Analysis: Contrasts With Mixed Logit." Transportation Research Part B: Methodological, 37(8), 681–698. doi:10.1016/S0191-2615(02)00046-2.

Greene WH, Hensher DA (2010). "Does Scale Heterogeneity Across Individuals Matter? An Empirical Assessment of Alternative Logit Models." Transportation, 37(3), 413–428. doi:10.1007/s11116-010-9259-z.

Greene WH, Hensher DA (2013). "Revealing Additional Dimensions of Preference Heterogeneity in a Latent Class Mixed Multinomial Logit Model." Applied Economics, 45(14), 1897–1902. doi:10.1080/00036846.2011.650325.

Hajivassiliou VA, Ruud PA (1986). "Classical Estimation Methods for LDV Models Using Simulation." In RF Engle, D McFadden (eds.), Handbook of Econometrics, volume 4 of Handbook of Econometrics, chapter 40, pp. 2383–2441. Elsevier.

Hasan A, Zhiyu W, Mahani AS (2015). mnlogit: Multinomial Logit Model. R package version 1.2.1, URL http://CRAN.R-project.org/package=mnlogit.

Henningsen A, Toomet O (2011). "maxLik: A Package for Maximum Likelihood Estimation in R." Computational Statistics, 26(3), 443–458. doi:10.1007/s00180-010-0217-1.

Hensher DA, Greene WH (2003). “The Mixed Logit Model: The State of Practice.” Trans-

portation,30(2), 133–176. doi:10.1023/A:1022558715350.

Hensher DA, Greene WH, Rose JM (2006). “Deriving Willingness-to-pay Estimates of Travel-

time Savings from Individual-based Parameters.” Environment and Planning A,38(12),

2365. doi:10.1068/a37395.

Hess S, Ben-Akiva M, Gopinath D, Walker J (2011). “Advantages of Latent Class Over

Continuous Mixture of Logit Models.” Working paper, Institute for Transport Studies,

University of Leeds.

Hess S, Rose JM (2012). “Can Scale and Coeﬃcient Heterogeneity be Separated in

Random Coeﬃcients Models?” Transportation,39(6), 1225–1239. doi:10.1007/

s11116-012-9394-9.

Hess S, Stathopoulos A (2013). “Linking Response Quality to Survey Engagement: A Com-

bined Random Scale and Latent Variable Approach.” Journal of Choice Modelling,7, 1–12.

doi:10.1016/j.jocm.2013.03.005.

Imai K, Dyk DAV (2005). “MNP:RPackage for Fitting the Multinomial Probit Model.”

Journal of Statistical Software,14(3), 1–32. doi:10.18637/jss.v014.i03. URL http:

//www.jstatsoft.org/v14/i03.

Jackson CH (2011). “Multi-State Models for Panel Data: The msm Package for R.” Journal

of Statistical Software,38(8), 1–29. doi:10.18637/jss.v038.i08. URL http://www.

jstatsoft.org/v38/i08/.

Keane M, Wasi N (2013). “Comparing Alternative Models of Heterogeneity in Consumer

Choice Behavior.” Journal of Applied Econometrics,28(6), 1018–1045. doi:10.1002/jae.

2304.

Kleiber C, Zeileis A (2008). Applied Econometrics with R. Springer-Verlag, New York. URL http://CRAN.R-project.org/package=AER.

Lee LF (1992). “On Efficiency of Methods of Simulated Moments and Maximum Simulated Likelihood Estimation of Discrete Response Models.” Econometric Theory, 8(04), 518–552.

Leisch F (2004). “FlexMix: A General Framework for Finite Mixture Models and Latent Class Regression in R.” Journal of Statistical Software, 11(8), 1–18. doi:10.18637/jss.v011.i08. URL http://www.jstatsoft.org/v11/i08/.

Lemon J (2006). “Plotrix: A Package in the Red Light District of R.” R-News, 6(4), 8–12.

Linzer DA, Lewis JB (2011). “poLCA: An R Package for Polytomous Variable Latent Class Analysis.” Journal of Statistical Software, 42(10), 1–29. doi:10.18637/jss.v042.i10. URL http://www.jstatsoft.org/v42/i10/.

McFadden D (1974). “Conditional Logit Analysis of Qualitative Choice Behavior.” In P Zarembka (ed.), Frontiers in Econometrics, pp. 105–142. Academic Press, New York.

McFadden D, Train K (2000). “Mixed MNL Models for Discrete Response.” Journal of Applied Econometrics, 15(5), 447–470. doi:10.1002/1099-1255(200009/10)15:5<447::AID-JAE570>3.0.CO;2-1.

R Core Team (2015). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org.

Rossi P (2012). bayesm: Bayesian Inference for Marketing/Micro-econometrics. R package version 2.2-5, URL http://CRAN.R-project.org/package=bayesm.

Rossi P, Allenby G, McCulloch R (2005). Bayesian Statistics and Marketing. Wiley, Hoboken, NJ.

Ruud PA (2007). “Estimating Mixtures of Discrete Choice Model.” Technical report.

Sándor Z, Train K (2004). “Quasi-Random Simulation of Discrete Choice Models.” Transportation Research Part B: Methodological, 38(4), 313–327. doi:10.1016/S0191-2615(03)00014-6.

Sarrias M (2015). Rchoice: Discrete Choice (Binary, Poisson and Ordered) Models with Random Parameters. R package version 0.3, URL http://CRAN.R-project.org/package=Rchoice.

Sarrias M, Daziano R (2015). gmnl: Multinomial Logit Models with Random Parameters. R package version 1.1, URL http://CRAN.R-project.org/package=gmnl.

Scarpa R, Thiene M (2005). “Destination Choice Models for Rock Climbing in the Northeastern Alps: A Latent-class Approach Based on Intensity of Preferences.” Land Economics, 81(3), 426–444. doi:10.3368/le.81.3.426.

Scarpa R, Thiene M, Train K (2008). “Utility in Willingness to Pay Space: A Tool to Address Confounding Random Scale Effects in Destination Choice to the Alps.” American Journal of Agricultural Economics, 90(4), 994–1010. doi:10.1111/j.1467-8276.2008.01155.x.

Shen J (2009). “Latent Class Model or Mixed Logit Model? A Comparison by Transport Mode Choice Data.” Applied Economics, 41(22), 2915–2924. doi:10.1080/00036840801964633.

Sonnier G, Ainslie A, Otter T (2007). “Heterogeneity Distributions of Willingness-to-Pay in Choice Models.” Quantitative Marketing and Economics, 5(3), 313–331. doi:10.1007/s11129-007-9024-6.

Train K (2009). Discrete Choice Methods with Simulation. 2nd edition. Cambridge University Press.

Train K, Weeks M (2005). “Discrete Choice Models in Preference Space and Willingness-to-Pay Space.” In R Scarpa, A Alberini (eds.), Applications of Simulation Methods in Environmental and Resource Economics, volume 6 of The Economics of Non-Market Goods and Resources, pp. 1–16. Springer Netherlands. ISBN 978-1-4020-3683-5. doi:10.1007/1-4020-3684-1_1. URL http://dx.doi.org/10.1007/1-4020-3684-1_1.

Train KE (2008). “EM Algorithms for Nonparametric Estimation of Mixing Distributions.” Journal of Choice Modelling, 1(1), 40–69. doi:10.1016/S1755-5345(13)70022-8.

Trautmann H, Steuer D, Mersmann O, Bornkamp B (2014). truncnorm: Truncated Normal Distribution. R package version 1.0-7, URL http://CRAN.R-project.org/package=truncnorm.

Venables WN, Ripley BD (2002). Modern Applied Statistics with S. Fourth edition. Springer-Verlag, New York. ISBN 0-387-95457-0, URL http://www.stats.ox.ac.uk/pub/MASS4.

Wedel M, Kamakura WA (2012). Market Segmentation: Conceptual and Methodological Foundations, volume 8. Springer Science & Business Media.

Yee TW (2010). “The VGAM Package for Categorical Data Analysis.” Journal of Statistical Software, 32(10), 1–34. doi:10.18637/jss.v032.i10.

Zeileis A, Croissant Y (2010). “Extended Model Formulas in R: Multiple Parts and Multiple Responses.” Journal of Statistical Software, 34(1), 1–13. URL http://www.jstatsoft.org/v34/i01/.

Zeileis A, Hothorn T (2002). “Diagnostic Checking in Regression Relationships.” R News, 2(3), 7–10. URL http://CRAN.R-project.org/doc/Rnews/.

Affiliation:

Mauricio Sarrias

Department of Economics

Universidad Católica del Norte

0610 Avenida Angamos, Antofagasta, Chile

E-mail: msarrias86@gmail.com

URL: https://msarrias.weebly.com

Ricardo A. Daziano

School of Civil and Environmental Engineering

Cornell University

305 Hollister Hall, Ithaca, NY 14850, USA

E-mail: daziano@cornell.edu

URL: http://www.cee.cornell.edu/research/groups/daziano/
