Article

Bayesian Applications in Marketing


Abstract

We review applications of Bayesian methods to marketing problems. Key aspects of marketing applications include the discreteness of response or outcome data and relatively large numbers of cross-sectional units, each with possibly low information content. The use of informative priors, including hierarchical models, is essential for successful Bayesian applications in marketing. Given the importance of the prior, it is important to ensure flexibility in the prior specification. Non-standard likelihoods and flexible priors make marketing a very challenging area for Bayesian applications.


... Remark 12.1. According to Rossi and Allenby (2009) and Rendon (2013), imposing prior distributions only for the parameters of the model leads to a fixed effects specification. Thus, there is no prior specification for the hyper-parameters of the priors. ...
Preprint
Full-text available
This paper investigates the identification and estimation of dynamic heterogeneous linear models for unbalanced panel data with known clustering structure and short time dimension (greater than or equal to 3). For this purpose, we use a linear multidimensional panel data model with additive cluster fixed effects and a mixed coefficient structure composed of cluster specific fixed effects and random cluster-individual-time specific effects. For estimation of the mean coefficients, we propose a Mean Cluster-FGLS estimator and a Mean Cluster-OLS estimator. In order to make feasible the GLS estimation of the cluster specific parameters, we introduce a ridge estimator of the variance-covariance matrix of the model. The Mean Cluster estimators are consistent: i) under stratified sampling when the number of clusters is fixed, the proportion of observed clusters is equal to 1 and the number of individuals per cluster grows to infinity or ii) under cluster sampling when the square root of the number of clusters grows at a slower rate than the growth rate of the number of individuals per cluster. In addition, we present two extensions of the baseline model. In the first one, we allow for cluster-individual specific fixed effects instead of cluster additive fixed effects. In this setting, we propose a Hierarchical Bayes estimator that takes into account the problem of unknown initial conditions. In the second extension, we allow for cross sectional dependence by including common factors. For estimation of this model, we propose the Mean Cluster estimator using the time demeaned variables. As an empirical application, we present the estimation of a value-added model of learning.
... Marketing researchers have long been interested in studying the drivers of households' brand choices. This interest has typically manifested itself in analyzing scanner panel data to identify the impact of marketing activities of firms as well as characteristics of households on the choices made by households in a product category (see, e.g., the long literature in marketing beginning with Guadagni and Little 1983 and leading up to the recent survey by Allenby and Rossi 2009). More recently, researchers have also been interested in studying the extent to which these choices of households may be influenced by the context within which the product is consumed (e.g., does the brand of beer chosen by a consumer depend on whether the consumption occurred in a bar or at home) as well as the social situation associated with the consumption occasion (e.g., whether the beer was consumed when the person is alone or with a close friend, a partner, etc.). ...
Article
Full-text available
Using a unique dataset on U.S. beer consumption, we investigate brand preferences of consumers across various social group and context related consumption scenarios (“scenarios”). As sufficient data are not available for each scenario, understanding these preferences requires us to share information across scenarios. Our proposed modeling framework has two main building blocks. The first is a standard continuous random coefficients logit model that the framework reduces to in the absence of information on social groups and consumption contexts. The second component captures variations in mean preferences across scenarios in a parsimonious fashion by decomposing the deviations in preferences from a base scenario into a low dimensional brand map in which the brand locations are fixed across scenarios but the importance weights vary by scenario. In addition to heterogeneity in brand preferences that is reflected in the random coefficients, heterogeneity in preferences across scenarios is accounted for by allowing the brand map itself to have a discrete heterogeneity distribution across consumers. Finally, heterogeneity in preferences within a scenario is accounted for by allowing the importance weights to vary across consumers. Together, these factors allow us to parsimoniously account for preference heterogeneity across brands, consumers and scenarios. We conduct a simulation study to reassure ourselves that using the kind of data that is available to us, our proposed estimator can recover the true model parameters from those data. We find that brand preferences vary considerably across the different social groups and consumption contexts as well as across different consumer segments. Despite the sparse data on specific brand-scenario combinations, our approach facilitates such an analysis and assessment of the relative strengths of brands in each of these scenarios. This could provide useful guidance to the brand managers of the smaller brands whose overall preference level might be low but which enjoy a customer franchise in a particular segment or in a particular context or a social group setting.
Article
We use a flexible hierarchical Bayes approach to provide a method for developing a personalized consideration set recommender system. The proposed method determines which products to recommend and in what order to present these recommendations. We demonstrate our method in the context of internet retail for home appliances. The empirical results show that the proposed method offers significant advantages in terms of both hit measures and exploring preference distribution. The recommender system that we develop can be used to provide personalized consideration set suggestions based on consumer preferences at the abstract level and to generate a potential list of customers for new product messages. Implications and suggestions for future research are also provided.
Article
Waiting in queue seems an inevitable part of many leisure experiences. Larson (1987) suggests waiting, and especially waiting in queue, is often viewed as a negative experience. The purpose of this experiment was to explore participants' reactions to waiting in line for a leisure event, and it was concerned primarily with discovering how the unpleasant aspects of a queue might be reduced. Participants were placed in a hypothetical queue prior to a leisure event (a concert). Various scenarios focused on selected staff interventions designed to minimize the drudgery of the queue. These interventions were concerned with (a) explanations for the delay, (b) engagement/entertainment while in queue, and (c) the provision of compensation (free pizza). Study participants disliked the queuing experience offered in this simulation. Though mood levels were uniformly positive prior to arriving at the event, they declined once the hypothetical queue experience began. Participants who received compensation expressed greater satisfaction with both the wait and the actions of staff during the wait; however, all participants seemed dissatisfied with both the wait and the service provider.
Article
Full-text available
The research agendas of psychologists and economists now have several overlaps, with behavioural economics providing theoretical and experimental study of the relationship between behaviour and choice, and hedonic psychology discussing appropriate measures of outcomes of choice in terms of overall utility or life satisfaction. Here we model the relationship between values (understood as principles guiding behaviour), choices and their final outcomes in terms of life satisfaction, and use data from the BHPS to assess whether our ideas on what is important in life (individual values) are broadly connected to what we experience as important in our lives (life satisfaction).
Article
Full-text available
Multiple response questions, also known as a pick any/J format, are frequently encountered in the analysis of survey data. The relationship among the responses is difficult to explore when the number of response options, J, is large. The authors propose a multivariate binomial probit model for analyzing multiple response data and use standard multivariate analysis techniques to conduct exploratory analysis on the latent multivariate normal distribution. A challenge of estimating the probit model is addressing identifying restrictions that lead to the covariance matrix specified with unit-diagonal elements (i.e., a correlation matrix). The authors propose a general approach to handling identifying restrictions and develop specific algorithms for the multivariate binomial probit model. The estimation algorithm is efficient and can easily accommodate many response options that are frequently encountered in the analysis of marketing data. The authors illustrate multivariate analysis of multiple response data in three applications.
Article
Full-text available
The authors present a Bayesian approach to the estimation of household parameters. Applied to the standard logit model, the procedure produces household-level estimates of all model parameters, enabling researchers to identify differences in household reaction to all variables in the marketing mix. Simulated data are used to study the small-sample performance of the estimator. The estimator can be easily implemented with standard algorithms used to maximize likelihood functions. In application to tuna scanner panel data, strong evidence of heterogeneity in price, display, and feature response (slope) parameters is detected. Approaches that fail to take into account slope heterogeneity are shown to underestimate the value of feature advertising and in-store displays in this dataset. In addition, the household price sensitivity estimates are strongly related to coupon usage and demonstrate how the estimates can be used to implement a targeted household drop.
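Since many of the entries below build on this household-level hierarchical logit idea, a minimal sketch may help fix ideas. The code is illustrative only and is not the authors' implementation (the paper itself works with standard maximum-likelihood routines); it shows the later, now-standard MCMC treatment of the same hierarchy: a random-walk Metropolis step for each household's logit coefficients inside a Gibbs sampler that shrinks them toward a population mean. All names and the fixed prior settings are my assumptions.

```python
# Minimal sketch (not the authors' code): hierarchical Bayes multinomial logit.
# Household h has choices y[h] (length T_h, integer alternative indices) over J
# alternatives with attributes X[h] of shape (T_h, J, K); beta_h ~ N(mu, V).
import numpy as np

def mnl_loglik(beta, X, y):
    """Multinomial logit log-likelihood for one household."""
    u = X @ beta                           # (T, J) deterministic utilities
    u -= u.max(axis=1, keepdims=True)      # stabilize the exponentials
    logp = u - np.log(np.exp(u).sum(axis=1, keepdims=True))
    return logp[np.arange(len(y)), y].sum()

def hb_logit(X, y, K, n_iter=2000, step=0.1, rng=np.random.default_rng(0)):
    H = len(X)
    beta = [np.zeros(K) for _ in range(H)]
    mu, V = np.zeros(K), np.eye(K)         # V held fixed for brevity; a full
    Vinv = np.linalg.inv(V)                # sampler would draw V (inv. Wishart)
    for _ in range(n_iter):
        # 1. Random-walk Metropolis step for each household's beta_h.
        for h in range(H):
            prop = beta[h] + step * rng.standard_normal(K)
            num = mnl_loglik(prop, X[h], y[h]) - 0.5 * (prop - mu) @ Vinv @ (prop - mu)
            den = mnl_loglik(beta[h], X[h], y[h]) - 0.5 * (beta[h] - mu) @ Vinv @ (beta[h] - mu)
            if np.log(rng.uniform()) < num - den:
                beta[h] = prop
        # 2. Conjugate draw for the population mean, mu ~ N(mean(beta), V/H),
        #    under a flat prior on mu.
        B = np.stack(beta)
        mu = B.mean(axis=0) + np.linalg.cholesky(V / H) @ rng.standard_normal(K)
        yield mu.copy(), B.copy()
```

With only a handful of purchases per household, each beta_h posterior borrows heavily from the population distribution, which is exactly the shrinkage behavior that makes household-level estimates usable for targeting.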
Article
Full-text available
In this article, the authors propose a Bayesian method for estimating disaggregate choice models using aggregate data. Compared with existing methods, the advantage of the proposed method is that it allows for the analysis of microlevel consumer dynamic behavior, such as the impact of purchase history on current brand choice, when only aggregate-level data are available. The essence of this approach is to simulate latent choice data that are consistent with the observed aggregate data. When the augmented choice data are made available in each iteration of the Markov chain Monte Carlo algorithm, the dynamics of consumer buying behavior can be explicitly modeled. The authors first demonstrate the validity of the method with a series of simulations and then apply the method to an actual store-level data set of consumer purchases of refrigerated orange juice. The authors find a significant amount of dynamics in consumer buying behavior. The proposed method is useful for managers to understand better the consumer purchase dynamics and brand price competition when they have access to aggregate data only.
Article
Full-text available
In models of demand and supply, consumer price sensitivity affects both the sales of a good through price, and the price that is set by producers and retailers. The relationship between the dependent variables (e.g., demand and price) and the common parameters (e.g., price sensitivity) is typically non-linear, especially when heterogeneity is present. In this paper, we develop a Bayesian method to address the computational challenge of estimating simultaneous demand and supply models that can be applied to both the analysis of household panel data and aggregated demand data. The method is developed within the context of a heterogeneous discrete choice model coupled with pricing equations derived from either specific competitive structures, or linear equations of the kind used in instrumental variable estimation, and applied to a scanner panel dataset of light beer purchases. Our analysis indicates that incorporating heterogeneity into the demand model all but eliminates the bias in the price parameter due to the endogeneity of price. The analysis also supports the use of a full information analysis.
Article
Full-text available
This monograph provides a review of choice models in marketing from the perspective of a utility-maximizing consumer subject to budgetary restrictions. Marketing models of choice have undergone many transformations over the last 20 years, and the advent of hierarchical Bayes models indicates that simple, theoretically grounded models work well when applied to understanding individual choices. Thus, we use economic theory to provide the foundation from which future trends are discussed. We begin our discussion with descriptive models of choice that raise a number of debatable issues for model improvement. We then look to economic theory as a basis for guiding model development, and conclude with a discussion of promising areas for future work.
Article
Full-text available
A multinomial logit model of brand choice, calibrated on 32 weeks of purchases of regular ground coffee by 100 households, shows high statistical significance for the explanatory variables of brand loyalty, size loyalty, presence/absence of store promotion, regular shelf price and promotional price cut. The model is parsimonious in that the coefficients of these variables are modeled to be the same for all coffee brand-sizes. The calibrated model predicts remarkably well the share of purchases by brand-size in a hold-out sample of 100 households over the 32-week calibration period and a subsequent 20-week forecast period. The success of the model is attributed in part to the level of detail and completeness of the household panel data employed, which has been collected through optical scanning of the Universal Product Code in supermarkets. Three short-term market response measures are calculated from the model: regular (depromoted) price elasticity of share, percent increase in share for a promotion with a median price cut, and promotional price cut elasticity of share. Response varies across brand-sizes in a systematic way with large share brand-sizes showing less response in percentage terms but greater in absolute terms. On the basis of the model a quantitative picture emerges of groups of loyal customers who are relatively insensitive to marketing actions and a pool of switchers who are quite sensitive. This article was originally published in Marketing Science, Volume 2, Issue 3, pages 203–238, in 1983.
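For concreteness, the loyalty variables in this model are exponentially smoothed purchase histories feeding a multinomial logit. In notation of my own choosing (the paper's definitions may differ in detail):

\[
BL_{h,b,t} = \alpha\, BL_{h,b,t-1} + (1-\alpha)\,\mathbf{1}\{h \text{ bought } b \text{ at } t-1\},
\qquad
P_{h,t}(i) = \frac{\exp(x_{h,i,t}'\beta)}{\sum_{j} \exp(x_{h,j,t}'\beta)},
\]

where the attribute vector \(x_{h,i,t}\) stacks brand loyalty, size loyalty, promotion presence, regular shelf price, and promotional price cut, with a single \(\beta\) shared across brand-sizes, which is the parsimony the abstract emphasizes.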
Article
Full-text available
Many theories of consumer behavior involve thresholds and discontinuities. In this paper, we investigate consumers' use of screening rules as part of a discrete-choice model. Alternatives that pass the screen are evaluated in a manner consistent with random utility theory; alternatives that do not pass the screen have a zero probability of being chosen. The proposed model accommodates conjunctive, disjunctive, and compensatory screening rules. We estimate a model that reflects a discontinuous decision process by employing the Bayesian technique of data augmentation and using Markov-chain Monte Carlo methods to integrate over the parameter space. The approach has minimal information requirements and can handle a large number of choice alternatives. The method is illustrated using a conjoint study of cameras. The results indicate that 92% of respondents screen alternatives on one or more attributes.
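One way to write the screening idea in symbols (notation mine, not the paper's): an alternative enters the consideration set only if it passes the screen, and choice among survivors follows random utility,

\[
S = \{\, j : a_{jk} \in A_k \ \text{for every screened attribute } k \,\},
\qquad
\Pr(y = j) =
\begin{cases}
\Pr\!\big(u_j = \max_{m \in S} u_m\big), & j \in S,\\
0, & j \notin S,
\end{cases}
\]

where \(A_k\) is the set of acceptable levels of attribute \(k\). A conjunctive rule requires all screens to pass, a disjunctive rule at least one, and a compensatory rule leaves \(S\) equal to the full choice set. The discontinuity at the boundary of \(S\) is what makes data augmentation attractive for estimation.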
Article
Full-text available
For several of the largest supermarket product categories, such as carbonated soft drinks, canned soups, ready-to-eat cereals, and cookies, consumers regularly purchase assortments of products. Within the category, consumers often purchase multiple products and multiple units of each alternative selected on a given trip. This multiple discreteness violates the single-unit purchase assumption of multinomial logit and probit models. The misspecification of such demand models in categories exhibiting multiple discreteness would produce incorrect measures of consumer response to marketing mix variables. In studying product strategy, these models would lead to misleading managerial conclusions. We use an alternative microeconomic model of demand for categories that exhibit the multiple discreteness problem. Recognizing the separation between the time of purchase and the time of consumption, we model consumers purchasing bundles of goods in anticipation of a stream of consumption occasions before the next trip. We apply the model to a panel of household purchases for carbonated soft drinks.
Article
Full-text available
Consumers are often observed to purchase more than one variety of a product on a given shopping trip. The simultaneous demand for varieties is observed not only for packaged goods such as yogurt or soft drinks, but in many other product categories such as movies, music compact disks, and apparel. Multinomial (MN) choice models cannot be applied to data exhibiting the simultaneous choice of more than one variety. The random utility interpretation of either the MN logit or probit model uses a linear utility specification that cannot accommodate interior solutions with more than one variety (alternative) chosen. To analyze data with multiple varieties chosen requires a nonstandard utility specification. Standard demand models in the economics literature exhibit only interior solutions. We propose a demand model based on a translated additive utility structure. The model nests the linear utility structure, while allowing for the possibility of a mixture of corner and interior solutions where more than one but not all varieties are selected. We use a random utility specification in which the unobservable portion of marginal utility follows a log-normal distribution. The distribution of quantity demanded (the basis of the likelihood function) is derived from these log-normal random utility errors. The likelihood function for this class of models with mixtures of corner and interior solutions is a mixed distribution with both a continuous density portion and probability mass points for the corners. The probability mass points must be calculated by integrals of the log-normal errors over rectangular regions. We evaluate these high-dimensional integrals using the GHK approximation. We employ a Bayesian hierarchical model, allowing household-specific utility parameters. Our utility specification is related to the approach of Wales and Woodland (1983), who employ a translated quadratic utility function. Wales and Woodland were only able to study, at the most, three varieties because there was no practical way to evaluate the utility function at that time. In addition, the quadratic utility specification is not a globally valid utility function, making welfare computations and policy experiments questionable. Hendel (1999) and Dube (1999) present an alternative approach in which the utility function is constructed by summing over unobservable consumption occasions. While only one variety is consumed on each occasion, the marginal utilities of varieties change over the consumption occasions, giving rise to a simultaneous purchase of multiple varieties. Our Bayesian inference approach allows us to obtain individual household estimates of utility parameters. Household utility estimates are used to compute the value of each variety. We compute a compensating value for the removal of each flavor; that is, we compute the monetary equivalent of the household's loss in utility from removal of a flavor. These calculations show that households highly value popular flavors and would incur substantial utility losses from removal of these flavors from the yogurt assortment. Next we consider the implications of our model for retailer assortment and pricing policies. Given limited shelf space, only a subset of the possible varieties can be displayed for purchase at any one time. If consumers value variety, then a retailer with lower variety must compensate the consumers in some way, such as a lower price level. We see this trade-off between price and variety across different retailing formats.
Discount or warehouse format retailers often have both lower variety and lower prices. To measure this trade-off, we explore the utility loss from reduction in variety and find the reductions in price that will compensate for this utility loss. These price reduction calculations must be based on a valid utility structure. Heterogeneity in tastes is critical in these utility computations and policy experiments. We find that a relatively small fraction of households with extreme preferences dominate the compensating value computations. That is, some households are observed to purchase mostly or exclusively one variety. These households must be heavily compensated for the removal of this variety from the assortment. In some retailing contexts, customization of the assortment is possible at the customer level. We show that such customization virtually eliminates any utility loss from reduction in variety.
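A translated additive utility of the kind described can be sketched as follows (my notation; the paper's exact parameterization may differ):

\[
U(x) = \sum_{j=1}^{J} \psi_j\,(x_j + \gamma_j)^{\alpha_j},
\qquad \ln \psi_j = \bar{\psi}_j + \varepsilon_j,
\]

where the translation parameters \(\gamma_j > 0\) permit corner solutions (\(x_j = 0\)), exponents \(0 < \alpha_j < 1\) produce satiation, and the log-normal errors enter through \(\psi_j\). The Kuhn-Tucker conditions \(\partial U/\partial x_j \le \lambda p_j\), holding with equality when \(x_j > 0\), generate the mixture of corner and interior quantities over which the likelihood integrates.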
Article
Full-text available
Marketing scholars and practitioners frequently infer market responses from cross-sectional or pooled cross-section by time data. Such cases occur especially when historical data are either absent or are not representative of the current market situation. We argue that inferring market responses using cross-sections of multimarket data may in some cases be misleading because these data also reflect unobserved actions by retailers. For example, because the (opportunity) costs of doing so do not outweigh the gains, retailers are predisposed against promoting small share brands. As a consequence, local prices and promotion variables depend on local market shares—the higher the local share, the higher the local observed promotion intensity. We refer to this reverse causation as an endogeneity. Ignoring it will inflate response estimates, because both the promotion effects on share as well as the reverse effects are in the same direction. In this paper, we propose a solution to this inference problem using the fact that retailers have trade territories consisting of multiple contiguous markets. This implies that the unobserved actions of retailers cause a measurable spatial dependence among the marketing variables. The intuition behind our approach is that by accounting for this spatial dependence, we account for the effects of the retailer's behavior. In this context, our study hopes to make the following contributions, at the core of which lies the above intuition. First, we separate the market response effect from the reverse retailer effect by computing responses to price and promotion net of any spatial—and therefore retailer—influence. Second, underlying this approach is a new variance-decomposition model for data with a panel structure. This model allows us to test for endogeneity of prices and promotion variables in the cross-sectional dimension of the data. This test aims to complement the one developed by Villas-Boas and Winer (1999), who test for endogeneity along the temporal dimension. Third, to illustrate the approach, we use Information Resources Inc. (IRI) market share data for brands in two mature and relatively undifferentiated product categories across 64 IRI markets. Whereas we only use data with very short time horizons to estimate price and promotion responses with the spatial model, we do have data over long time windows. We use the latter to validate the approach. Specifically, within-market estimates of price and promotion response are not subject to the same endogeneity because we hold the set of retailers constant. Therefore, comparing within- and across-market estimates of price and promotion responses is a natural way to validate the approach. Consistent with our argument, ignoring the reverse causation in the cross-sectional data leads to inferences of price and promotion elasticities that are farther away from zero than the elasticities obtained from within-market analysis. In contrast, cross-sectional spatial estimates and time-series estimates show convergent validity. From a practical point of view, this means it is possible to obtain reasonable within-market estimates of price and promotion elasticities from (predominantly) cross-sectional data. This may benefit marketing managers. The manager who would act on the inflated elasticities will over-allocate marketing resources to promotions because she ignores retailers' censorship of promotions on the basis of already existing high share.
We explore other approaches to correct for the inference bias, and discuss further managerial issues and future research.
Article
Full-text available
An important aspect of marketing practice is the targeting of consumer segments for differential promotional activity. The premise of this activity is that there exist distinct segments of homogeneous consumers who can be identified by readily available demographic information. The increased availability of individual consumer panel data opens the possibility of direct targeting of individual households. The goal of this paper is to assess the information content of various information sets available for direct marketing purposes. Information on the consumer is obtained from the current and past purchase history as well as demographic characteristics. We consider the situation in which the marketer may have access to a reasonably long purchase history which includes both the products purchased and information on the causal environment. Short of this complete purchase history, we also consider more limited information sets which consist of only the current purchase occasion or only information on past product choice without causal variables. Proper evaluation of this information requires a flexible model of heterogeneity which can accommodate observable and unobservable heterogeneity as well as produce household level inferences for targeting purposes. We develop new econometric methods to implement a random coefficient choice model in which the heterogeneity distribution is related to observable demographics. We couple this approach to modeling heterogeneity with a target couponing problem in which coupons are customized to specific households on the basis of various information sets. The couponing problem allows us to place a monetary value on the information sets. Our results indicate there exists a tremendous potential for improving the profitability of direct marketing efforts by more fully utilizing household purchase histories. Even rather short purchase histories can produce a net gain in revenue from target couponing which is 2.5 times the gain from blanket couponing. The most popular current electronic couponing trigger strategy uses only one observation to customize the delivery of coupons. Surprisingly, even the information contained in observing one purchase occasion boosts net couponing revenue by 50% more than that which would be gained by the blanket strategy. This result, coupled with increased competitive pressures, will force targeted marketing strategies to become much more prevalent in the future than they are today.
Article
Full-text available
Rotating indifference curves are used to induce an income effect that favors superior brands at the expense of inferior brands in a discrete choice model. When calibrated on scanner panel data, the model yields an objective measure of brand quality which is related to the rate of rotation. The model also leads to asymmetric responses to price promotions, where switching up to high quality brands is more likely than switching down. The model is capable of nesting the standard logit model, and is similar to a nested logit model when there exist clusters of brands of like quality. The model is used to explore a product line pricing decision where profits are maximized subject to the constraint that consumer utility is maintained.
Book
Full-text available
This book describes the new generation of discrete choice methods, focusing on the many advances that are made possible by simulation. Researchers use these statistical methods to examine the choices that consumers, households, firms, and other agents make. Each of the major models is covered: logit, generalized extreme value, or GEV (including nested and cross-nested logits), probit, and mixed logit, plus a variety of specifications that build on these basics. Simulation-assisted estimation procedures are investigated and compared, including maximum simulated likelihood, method of simulated moments, and method of simulated scores. Procedures for drawing from densities are described, including variance reduction techniques such as antithetics and Halton draws. Recent advances in Bayesian procedures are explored, including the use of the Metropolis-Hastings algorithm and its variant Gibbs sampling. No other book incorporates all these fields, which have arisen in the past 20 years. The procedures are applicable in many fields, including energy, transportation, environmental studies, health, labor, and marketing.
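As one illustration of the simulation machinery the book covers, the sketch below computes a mixed logit choice probability by averaging logit probabilities over Halton-based draws of the random coefficients. It is a toy implementation under my own assumptions (independent normal coefficients, one prime per dimension), not code from the book.

```python
# Illustrative sketch: simulated mixed logit probability with Halton draws.
import numpy as np
from scipy.stats import norm

def halton(n, base):
    """First n points of the 1-D Halton sequence for a prime base (values in (0,1))."""
    seq = np.zeros(n)
    for i in range(1, n + 1):
        f, x, k = 1.0, 0.0, i
        while k > 0:
            f /= base
            x += f * (k % base)
            k //= base
        seq[i - 1] = x
    return seq

def mixed_logit_prob(X, j, mu, sigma, n_draws=200):
    """P(choice = j) for one decision, beta ~ N(mu, diag(sigma^2)), X: (J, K)."""
    primes = [2, 3, 5, 7, 11, 13][: len(mu)]
    z = np.column_stack([norm.ppf(halton(n_draws, p)) for p in primes])
    betas = mu + z * sigma                      # (n_draws, K) coefficient draws
    u = betas @ X.T                             # (n_draws, J) utilities
    p = np.exp(u - u.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return p[:, j].mean()                       # simulated choice probability
```

Compared with pseudo-random draws, the Halton points cover the unit interval more evenly, which is why far fewer draws are typically needed for the same simulation error.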
Article
Full-text available
An extensive literature in econometrics and in numerical analysis has considered the problem of evaluating the multiple integral \(P(B; \mu, \Omega) = \int \mathbf{1}(v \in B)\, n(v - \mu, \Omega)\, dv\), where V is an m-dimensional normal vector with mean μ, covariance matrix Ω, and density n(v − μ, Ω), and 1(V ∈ B) is an indicator for the event B = {V : a < V < b}. A leading case of such an integral is the negative orthant probability, where B = {V : V < 0}. The problem is computationally difficult except in very special cases. The multinomial probit (MNP) model used in econometrics and biometrics has cell probabilities that are negative orthant probabilities, with μ and Ω depending on unknown parameters (and, in general, on covariates). Estimation of this model requires, for each trial parameter vector and each observation in a sample, evaluation of P(B; μ, Ω) and of its derivatives with respect to μ and Ω. This paper surveys Monte Carlo techniques that have been developed for approximations of P(B; μ, Ω) and its linear and logarithmic derivatives, that limit computation while possessing properties that facilitate their use in iterative calculations for statistical inference: the Crude Frequency Simulator (CFS), Normal Importance Sampling (NIS), a Kernel-Smoothed Frequency Simulator (KFS), Stern's Decomposition Simulator (SDS), the Geweke-Hajivassiliou-Keane Simulator (GHK), a Parabolic Cylinder Function Simulator (PCF), Deák's Chi-squared Simulator (DCS), an Acceptance/Rejection Simulator (ARS), the Gibbs Sampler Simulator (GSS), a Sequentially Unbiased Simulator (SUS), and an Approximately Unbiased Simulator (AUS). We also discuss Gauss and FORTRAN implementations of these algorithms and present our computational experience with them. We find that GHK is overall the most reliable method.
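Because the survey singles out GHK as the most reliable method, a compact sketch of it may be useful. This is an illustrative implementation in my own notation, not the paper's code: each component of the transformed normal vector is drawn from its truncated conditional, and the truncation probabilities accumulate into an importance weight whose average estimates the rectangle probability.

```python
# Illustrative GHK simulator for P(a < V < b), V ~ N(mu, Omega).
import numpy as np
from scipy.stats import norm

def ghk(mu, Omega, a, b, n_draws=1000, rng=np.random.default_rng(0)):
    m = len(mu)
    L = np.linalg.cholesky(Omega)          # V = mu + L e with e ~ N(0, I)
    total = 0.0
    for _ in range(n_draws):
        e = np.zeros(m)
        w = 1.0                            # importance weight of this draw
        for i in range(m):
            cond = mu[i] + L[i, :i] @ e[:i]      # conditional mean of V_i
            lo = norm.cdf((a[i] - cond) / L[i, i])
            hi = norm.cdf((b[i] - cond) / L[i, i])
            w *= hi - lo                   # prob. that e_i lands in the slice
            e[i] = norm.ppf(rng.uniform(lo, hi))  # truncated-normal draw
        total += w
    return total / n_draws
```

The estimator is smooth in \(\mu\) and \(\Omega\), which is the property that makes it usable inside the iterative likelihood calculations the paper has in mind.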
Article
Full-text available
We investigate direct and indirect specification of the distribution of consumer willingness-to-pay (WTP) for changes in product attributes in a choice setting. Typically, choice models identify WTP for an attribute as a ratio of the estimated attribute and price coefficients. Previous research in marketing and economics has discussed the problems with allowing for random coefficients on both attribute and price, especially when the distribution of the price coefficient has mass near zero. These problems can be avoided by combining a parameterization of the likelihood function that directly identifies WTP with a normal prior for WTP. We show that the typical likelihood parameterization in combination with what are regarded as standard heterogeneity distributions for attribute and price coefficients results in poorly behaved posterior WTP distributions, especially in small sample settings. The implied prior for WTP readily allows for substantial mass in the tails of the distribution and extreme individual-level estimates of WTP. We also demonstrate the sensitivity of profit maximizing prices to parameterization and priors for WTP.
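The two parameterizations at issue can be written compactly (notation mine):

\[
U_{ij} = \beta_i' x_j - \alpha_i p_j + \varepsilon_{ij}
\;\Rightarrow\; w_i = \beta_i/\alpha_i,
\qquad \text{versus} \qquad
U_{ij} = \alpha_i\,(w_i' x_j - p_j) + \varepsilon_{ij}.
\]

In the first, indirect form, WTP is a ratio of random coefficients, so a heterogeneity distribution placing mass near \(\alpha_i = 0\) produces extremely heavy-tailed implied WTP. The second, surplus form identifies \(w_i\) directly, so a normal prior can be placed on WTP itself, which is the combination the abstract recommends.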
Article
Full-text available
The author develops a practical extension of D. McFadden's method of simulated moments estimator for limited dependent variable models to the panel data case. The method is based on factorization of the method of simulated moments first order condition into transition probabilities, along with development of an accurate new method for simulating transition probabilities. Monte Carlo tests indicate that this method of simulated moments estimator performs quite well relative to quadrature-based maximum likelihood estimators. It allows estimation of complex panel data models involving random effects and ARMA errors in computational time similar to those necessary for estimation of random effects models via maximum likelihood quadrature. Copyright 1994 by The Econometric Society.
Article
Full-text available
Questions that use a discrete ratings scale are commonplace in survey research. Examples in marketing include customer satisfaction measurement and purchase intention. Survey research practitioners have long commented that respondents vary in their usage of the scale: Common patterns include using only the middle of the scale or using the upper or lower end. These differences in scale usage can impart biases to correlation and regression analyses. To capture scale usage differences, we developed a new model with individual scale and location effects and a discrete outcome variable. We model the joint distribution of all ratings scale responses rather than specific univariate conditional distributions as in the ordinal probit model. We apply our model to a customer satisfaction survey and show that the correlation inferences are much different once proper adjustments are made for the discreteness of the data and scale usage. We also show that our adjusted or latent ratings scale is more closely related to actual purchase behavior.
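One way to formalize the scale-usage idea (my notation; the paper's treatment of the cutoffs may differ): each respondent applies an individual location shift and scaling to a common latent rating vector,

\[
x_{ij} = k \iff c_{k-1} < z_{ij} \le c_k,
\qquad
z_{ij} = \tau_i + \sigma_i\,\eta_{ij},
\quad \eta_i \sim N(\mu, \Sigma),
\]

so that yea-saying appears as a large \(\tau_i\) and middle-of-the-scale usage as a small \(\sigma_i\). Correlations between questions are then read off \(\Sigma\), the latent scale, rather than off the raw ratings, which is why the adjusted inferences differ.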
Article
This article is divided into two parts. The first part considers flexible parametric models, while the second is nonparametric. It gives applications to regional growth data and semiparametric estimation of binomial proportions. It reviews methods for flexible mean regression, using either basis functions or Gaussian processes. The article also discusses Dirichlet processes and describes various posterior simulation algorithms for Bayesian nonparametric models, whose usefulness is shown in empirical illustrations. Applications discussed include modeling a quantity of interest as a function of income and estimating a cost function for electricity distribution. The article lists some freely available software that can accommodate many of the methods discussed, and provides a detailed discussion of both theory and computation for flexible treatment of distributions, functional forms, or both.
Article
In this paper we develop new Markov chain Monte Carlo schemes for Bayesian estimation of DSGE models. The motivation for our work arises from some of the shortcomings of the single block random walk Metropolis-Hastings (M-H) algorithm (RW-MH), the sampling method that has been used to date in this context. In our basic (TaB-MH) algorithm, the parameters of the model are randomly clustered at every iteration into an arbitrary number of blocks. Then each block is sequentially updated through an M-H step. Furthermore, the proposal density for each block is tailored to the location and curvature of the target density based on the output of a suitably formulated version of simulated annealing, following Chib and Greenberg (1994, 1995) and Chib and Ergashev (2008). We also provide an extension of this algorithm for sampling multi-modal distributions. In this version, which we refer to as the TaBMJ-MH algorithm, at a pre-specified mode-jumping iteration (say every 100th), a single-block proposal is generated from one of the modal regions using a mixture proposal density, and this proposal is then accepted according to an M-H probability of move. At the non-mode-jumping iterations, the draws are obtained by applying the TaB-MH algorithm. The methodological developments are completed by showing how the approach in Chib (1995) and Chib and Jeliazkov (2001) can be adapted to these sampling schemes for estimating the model marginal likelihood. We illustrate our methods with the aid of stylized problems and two DSGE models that have appeared in the literature. The first is the model in Ireland (2004), where we show that the TaB-MH algorithm is more reliable and efficient than the RW-MH algorithm. Our second example is the model in An and Schorfheide (2007). The posterior distribution in this model is more challenging to simulate on account of multiple modes. As shown by these authors, the RW-MH algorithm is unable to jump from the low modal region to the high modal region, and vice versa. The TaBMJ-MH method, on the other hand, does not suffer from this problem and moves between the two modal regions and explores the posterior distribution globally in an efficient manner.
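The random clustering-into-blocks device is easy to convey in code. The sketch below is a deliberately simplified stand-in, under my own assumptions: it keeps the random blocking and per-block M-H accept/reject, but replaces the tailored, simulated-annealing-based proposals of the TaB-MH algorithm with a plain random walk.

```python
# Simplified sketch of a randomized-block Metropolis-Hastings pass
# (plain random-walk proposals; the paper tailors each block's proposal).
import numpy as np

def randomized_block_mh(logpost, theta, n_iter=1000, step=0.1,
                        rng=np.random.default_rng(0)):
    d = len(theta)
    lp = logpost(theta)
    for _ in range(n_iter):
        n_blocks = rng.integers(1, d + 1)       # random number of blocks
        labels = rng.integers(0, n_blocks, d)   # random cluster assignment
        for b in range(n_blocks):
            idx = np.where(labels == b)[0]
            if idx.size == 0:
                continue
            prop = theta.copy()
            prop[idx] += step * rng.standard_normal(idx.size)
            lp_prop = logpost(prop)
            if np.log(rng.uniform()) < lp_prop - lp:  # per-block M-H step
                theta, lp = prop, lp_prop
        yield theta.copy()
```

Random blocking guards against permanently unlucky groupings of correlated parameters: any pair of parameters is updated jointly in some iterations and separately in others.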
Article
This paper provides a unified simulation-based Bayesian and non-Bayesian analysis of correlated binary data using the multivariate probit model. The posterior distribution is simulated by Markov chain Monte Carlo methods, and maximum likelihood estimates are obtained by a Monte Carlo version of the EM algorithm. Computation of Bayes factors from the simulation output is also considered. The methods are applied to a bivariate data set, to a 534-subject, four-year longitudinal data set from the Six Cities study of the health effects of air pollution, and to a seven-year data set on the labor supply of married women from the Panel Survey of Income Dynamics.
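A stripped-down sketch of the data-augmentation scheme for the multivariate probit may be useful here. This is illustrative only: the hard step, updating the correlation matrix under its identification constraints, is omitted and Sigma is held fixed; variable names and the beta prior are my assumptions.

```python
# Illustrative Gibbs sampler for the multivariate probit with Sigma fixed.
import numpy as np
from scipy.stats import truncnorm

def mvp_gibbs(X, Y, Sigma, n_iter=500, rng=np.random.default_rng(0)):
    """X: (n, J, K) covariates, Y: (n, J) binary outcomes, Sigma: (J, J)."""
    n, J, K = X.shape
    beta = np.zeros(K)
    Z = np.where(Y == 1, 0.5, -0.5).astype(float)   # crude start for latents
    Sinv = np.linalg.inv(Sigma)
    for _ in range(n_iter):
        # 1. Draw each latent z_ij from its conditional normal, truncated to
        #    the half-line implied by the observed y_ij (transparent but
        #    deliberately unoptimized).
        for i in range(n):
            m = X[i] @ beta
            for j in range(J):
                idx = [k for k in range(J) if k != j]
                s12 = Sigma[j, idx]
                s22i = np.linalg.inv(Sigma[np.ix_(idx, idx)])
                cm = m[j] + s12 @ s22i @ (Z[i, idx] - m[idx])
                sd = np.sqrt(Sigma[j, j] - s12 @ s22i @ s12)
                lo, hi = (0.0, np.inf) if Y[i, j] == 1 else (-np.inf, 0.0)
                Z[i, j] = truncnorm.rvs((lo - cm) / sd, (hi - cm) / sd,
                                        loc=cm, scale=sd, random_state=rng)
        # 2. Conjugate GLS draw for beta given the latents, prior beta ~ N(0, I).
        P = sum(X[i].T @ Sinv @ X[i] for i in range(n)) + np.eye(K)
        V = np.linalg.inv(P)
        bhat = V @ sum(X[i].T @ Sinv @ Z[i] for i in range(n))
        beta = rng.multivariate_normal(bhat, V)
        yield beta
```

Conditioning on the latent utilities turns the discrete-data problem into a seemingly unrelated regression, which is what makes both the Bayesian sampler and the Monte Carlo EM variant described in the abstract tractable.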
Article
Sales response models are widely used as the basis for optimizing the marketing mix. Response models condition on the observed marketing-mix variables and focus on the specification of the distribution of observed sales given marketing-mix activities. The models usually fail to recognize that the levels of the marketing-mix variables are often chosen with at least partial knowledge of the response parameters in the conditional model. This means that contrary to standard assumptions, the marginal distribution of the marketing-mix variables is not independent of response parameters. The authors expand on the standard conditional model to include a model for the determination of the marketing-mix variables. They apply this modeling approach to the problem of gauging the effectiveness of sales calls (details) to induce greater prescribing of drugs by individual physicians. They do not assume a priori that details are set optimally, but instead they infer the extent to which sales force managers have knowledge of responsiveness, and they use this knowledge to set the level of sales force contact. The authors find that their modeling approach improves the precision of the physician-specific response parameters significantly. They also find that physicians are not detailed optimally; high-volume physicians are detailed to a greater extent than low-volume physicians without regard to responsiveness to detailing. It appears that unresponsive but high-volume physicians are detailed the most. Finally, the authors illustrate how their approach provides a general framework.
Article
This article discusses the use of Bayesian methods for estimating logit demand models using aggregate data. We analyze two different demand systems: independent samples and consumer panel. Under the first system, there is a different and independent random sample of N consumers in each period and each consumer makes only a single purchase decision. Under the second system, the same N consumers make a purchase decision in each of T periods. Interestingly, there exists an asymptotic link between these two systems, which has important implications for the estimation of these demand models. The proposed methods are illustrated using simulated and real data. Copyright © 2009 John Wiley & Sons, Ltd.
Article
Consumers make multicategory decisions in a variety of contexts such as choice of multiple categories during a shopping trip or mail-order purchasing. The choice of one category may affect the selection of another category due to the complementary nature (e.g., cake mix and cake frosting) of the two categories. Alternatively, two categories may co-occur in a shopping basket not because they are complementary but because of similar purchase cycles (e.g., beer and diapers) or because of a host of other unobserved factors. While complementarity gives managers some control over consumers' buying behavior (e.g., a change in the price of cake mix could change the purchase probability of cake frosting), co-occurrence or co-incidence is less controllable. Other factors that may affect multi-category choice may be (unobserved) household preferences or (observed) household demographics. We also argue that not accounting for these three factors simultaneously could lead to erroneous inferences. We then develop a conceptual framework that incorporates complementarity, co-incidence and heterogeneity (both observed and unobserved) as the factors that could lead to multi-category choice. We then translate this framework into a model of multi-category choice. Our model is based on random utility theory and allows for simultaneous, interdependent choice of many items. This model, the multivariate probit model, is implemented in a Hierarchical Bayes framework. The hierarchy consists of three levels. The first level captures the choice of items for the shopping basket during a shopping trip. The second level captures differences across households and the third level specifies the priors for the unknown parameters. We generalize some recent advances in Markov chain Monte Carlo methods in order to estimate the model. Specifically, we use a substitution sampler which incorporates techniques such as the Metropolis Hit-and-Run algorithm and the Gibbs Sampler. The model is estimated on four categories (cake mix, cake frosting, fabric detergent and fabric softener) using multicategory panel data. The results disentangle the complementarity and co-incidence effects. The complementarity results show that pricing and promotional changes in one category affect purchase incidence in related product categories. In general, the cross-price and cross-promotion effects are smaller than the own-price and own-promotions effects. The cross-effects are also asymmetric across pairs of categories, i.e., related category pairs may be characterized as having a “primary” and a “secondary” category. Thus these results provide a more complete description of the effects of promotional changes by examining them both within and across categories. The co-incidence results show the extent of the relationship between categories that arises from uncontrollable and unobserved factors. These results are useful since they provide insights into a general structure of dependence relationships across categories. The heterogeneity results show that observed demographic factors such as family size influence the intrinsic category preference of households. Larger family sizes also tend to make households more price sensitive for both the primary and secondary categories. We find that price sensitivities across categories are not highly correlated at the household level. We also find some evidence that intrinsic preferences for cake mix and cake frosting are more closely related than preferences for fabric detergent and fabric softener.
We compare our model with a series of null models using both estimation and holdout samples. We show that both complementarity and co-incidence play a significant role in predicting multicategory choice. We also show how many single-category models used in conjunction may not be good predictors of joint choice. Our results are likely to be of interest to retailers and manufacturers trying to optimize pricing and promotion strategies across many categories as well as in designing micromarketing strategies. We illustrate some of these benefits by carrying out an analysis which shows that the “true” impact of complementarity and co-incidence on profitability is significant in a retail setting. Our model can also be applied to other domains. The combination of item interdependence and individual household level estimates may be of particular interest to database marketers in building customized “cross-selling” strategies in the direct mail and financial service industries.
Article
We introduce a set of new Markov chain Monte Carlo algorithms for Bayesian analysis of the multinomial probit model. Our Bayesian representation of the model places a new, and possibly improper, prior distribution directly on the identifiable parameters and thus is relatively easy to interpret and use. Our algorithms, which are based on the method of marginal data augmentation, involve only draws from standard distributions and dominate other available Bayesian methods in that they are as quick to converge as the fastest methods but with a more attractive prior specification. C-code along with an R interface for our algorithms is publicly available.
Article
We develop a Bayesian semi-parametric approach to the instrumental variable problem. We assume linear structural and reduced form equations, but model the error distributions non-parametrically. A Dirichlet process prior is used for the joint distribution of structural and instrumental variable equations errors. Our implementation of the Dirichlet process prior uses a normal distribution as a base model. It can therefore be interpreted as modeling the unknown joint distribution with a mixture of normal distributions with a variable number of mixture components. We demonstrate that this procedure is both feasible and sensible using actual and simulated data. Sampling experiments compare inferences from the non-parametric Bayesian procedure with those based on procedures from the recent literature on weak instrument asymptotics. When errors are non-normal, our procedure is more efficient than standard Bayesian or classical methods.
Article
Many consumer choice situations are characterized by the simultaneous demand for multiple alternatives that are imperfect substitutes for one another. A simple and parsimonious multiple discrete-continuous extreme value (MDCEV) econometric approach to handle such multiple discreteness was formulated by Bhat (2005) [Bhat, C.R., 2005. A multiple discrete-continuous extreme value model: formulation and application to discretionary time-use decisions. Transportation Research Part B 39(8), 679–707]. within the broader Kuhn–Tucker (KT) multiple discrete-continuous economic consumer demand model of Wales and Woodland (1983) [Wales, T.J., and Woodland, A.D., 1983. Estimation of consumer demand systems with binding non-negativity constraints. Journal of Econometrics 21(3), 263–85]. This paper examines several issues associated with the MDCEV model and other extant KT multiple discrete-continuous models. Specifically, the paper proposes a new utility function form that enables clarity in the role of each parameter in the utility specification, presents identification considerations associated with both the utility functional form as well as the stochastic nature of the utility specification, extends the MDCEV model to the case of price variation across goods and to general error covariance structures, discusses the relationship between earlier KT-based multiple discrete-continuous models, and illustrates the many technical nuances and identification considerations of the multiple discrete-continuous model structure through empirical examples. The paper also highlights the technical problems associated with the stochastic specification used in the KT-based multiple discrete-continuous models formulated in recent Environmental Economics papers.
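The MDCEV utility referred to here has the additively separable form (as closely as I can reconstruct Bhat's notation; treat the details as a sketch):

\[
U(x) = \sum_{k=1}^{K} \frac{\gamma_k}{\alpha_k}\,\psi_k
\left(\Big(\frac{x_k}{\gamma_k} + 1\Big)^{\alpha_k} - 1\right),
\qquad
\psi_k = \exp(\beta' z_k + \varepsilon_k),
\]

where \(\psi_k\) is the baseline marginal utility at zero consumption, \(\gamma_k\) is a translation parameter that permits corner solutions, and \(\alpha_k\) governs satiation. With i.i.d. extreme-value \(\varepsilon_k\), the probability of any observed pattern of chosen goods and quantities has a closed form; the identification issues the paper stresses arise partly because \(\gamma_k\) and \(\alpha_k\) play overlapping satiation roles.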
Article
We present a Bayesian approach for analyzing aggregate level sales data in a market with differentiated products. We consider the aggregate share model proposed by Berry et al. [Berry, Steven, Levinsohn, James, Pakes, Ariel, 1995. Automobile prices in market equilibrium. Econometrica. 63 (4), 841–890], which introduces a common demand shock into an aggregated random coefficient logit model. A full likelihood approach is possible with a specification of the distribution of the common demand shock. We introduce a reparameterization of the covariance matrix to improve the performance of the random walk Metropolis for covariance parameters. We illustrate the usefulness of our approach with both actual and simulated data. Sampling experiments show that our approach performs well relative to the GMM estimator even in the presence of a mis-specified shock distribution. We view our approach as useful for those who are willing to trade off one additional distributional assumption for increased efficiency in estimation.
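The aggregate share system being estimated can be written as (notation mine):

\[
s_{jt} = \int \frac{\exp(x_{jt}'\beta + \eta_{jt})}
{1 + \sum_{k=1}^{J} \exp(x_{kt}'\beta + \eta_{kt})}\, dF(\beta),
\qquad j = 1, \dots, J,
\]

where \(\eta_{jt}\) is the common demand shock. Completing the model with a distribution for \(\eta_{jt}\) (for example, normal) converts the usual share-inversion/GMM setup into a full likelihood, which is the one extra distributional assumption the authors propose trading for estimation efficiency.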
Article
We present a new prior and corresponding algorithm for Bayesian analysis of the multinomial probit model. Our new approach places a prior directly on the identified parameter space. The key is the specification of a prior on the covariance matrix so that the (1,1) element is fixed at 1 and it is possible to draw from the posterior using standard distributions. Analytical results are derived which can be used to aid in assessment of the prior. © 2000 Elsevier Science S.A. All rights reserved. JEL classification: C11; C25; C33; C35.
Article
Several consumer demand choices are characterized by the choice of multiple alternatives simultaneously. An example of such a choice situation in activity-travel analysis is the type of discretionary (or leisure) activity to participate in and the duration of time investment of the participation. In this context, within a given temporal period (say a day or a week), an individual may decide to participate in multiple types of activities (for example, in-home social activities, out-of-home social activities, in-home recreational activities, out-of-home recreational activities, and out-of-home non-maintenance shopping activities).
Article
We develop new methods for conducting a finite sample, likelihood-based analysis of the multinomial probit model. Using a variant of the Gibbs sampler, an algorithm is developed to draw from the exact posterior of the multinomial probit model with correlated errors. This approach avoids direct evaluation of the likelihood and, thus, avoids the problems associated with calculating choice probabilities which affect both the standard likelihood and method of simulated moments approaches. Both simulated and actual consumer panel data are used to fit six-dimensional choice models. We also develop methods for analyzing random coefficient and multiperiod probit models.
Article
A common theme in the marketing literature is the acquisition and retention of customers as they trade up from inexpensive, introductory offerings to those of higher quality. Standard models of choice, however, apply to narrowly defined categories for which assumptions of near-perfect substitution are valid. We extend the non-homothetic choice model of Allenby and Rossi (1991) to accommodate effects of advertising, professional recommendation and other factors that facilitate the description and management of trade-up. The model is applied to a national study of an over-the-counter health product.
Article
Includes bibliography and index.
Article
A random process called the Dirichlet process whose sample functions are almost surely probability measures has been proposed by Ferguson as an approach to analyzing nonparametric problems from a Bayesian viewpoint. An important result obtained by Ferguson in this approach is that if observations are made on a random variable whose distribution is a random sample function of a Dirichlet process, then the conditional distribution of the random measure can be easily calculated, and is again a Dirichlet process. This paper extends Ferguson's result to cases where the random measure is a mixing distribution for a parameter which determines the distribution from which observations are made. The conditional distribution of the random measure, given the observations, is no longer that of a simple Dirichlet process, but can be described as being a mixture of Dirichlet processes. This paper gives a formal definition for these mixtures and develops several theorems about their properties, the most important of which is a closure property for such mixtures. Formulas for computing the conditional distribution are derived and applications to problems in bio-assay, discrimination, regression, and mixing distributions are given.
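Ferguson's conjugacy result, the starting point this paper generalizes, can be stated in one line (modern notation):

\[
G \sim \mathrm{DP}(\alpha G_0), \quad
x_1, \dots, x_n \mid G \,\overset{\text{iid}}{\sim}\, G
\;\Longrightarrow\;
G \mid x_{1:n} \sim \mathrm{DP}\Big(\alpha G_0 + \sum_{i=1}^{n} \delta_{x_i}\Big).
\]

In the mixing setting of this paper, \(x_i \mid \theta_i \sim f(\cdot \mid \theta_i)\) with \(\theta_i \sim G\), the \(\theta_i\) are unobserved, so the posterior on \(G\) is no longer a single Dirichlet process but a mixture of Dirichlet processes, mixed over the posterior distribution of the latent \(\theta_{1:n}\).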
Article
We review and extend results related to optimal scaling of Metropolis–Hastings algorithms. We present various theoretical results for the high-dimensional limit. We also present simulation studies which confirm the theoretical results in finite-dimensional contexts.
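The best-known practical payoff of this literature is the target acceptance rate of roughly 0.234, together with the proposal scale \(2.38/\sqrt{d}\), for random-walk Metropolis on high-dimensional targets with i.i.d. components. A hedged sketch of how such results get used in practice (an illustrative diminishing-adaptation scheme, not an algorithm from the paper):

```python
# Illustrative: adapt a random-walk Metropolis step size toward ~0.234 acceptance.
import numpy as np

def adaptive_rwmh(logpost, theta, n_iter=5000, target=0.234,
                  rng=np.random.default_rng(0)):
    d = len(theta)
    step = 2.38 / np.sqrt(d)           # theoretical starting point
    lp = logpost(theta)
    for it in range(1, n_iter + 1):
        prop = theta + step * rng.standard_normal(d)
        lp_prop = logpost(prop)
        accepted = np.log(rng.uniform()) < lp_prop - lp
        if accepted:
            theta, lp = prop, lp_prop
        # Diminishing adaptation: nudge the step toward the target rate,
        # with an influence that decays so the chain's limit is preserved.
        step *= np.exp((float(accepted) - target) / it ** 0.6)
        yield theta.copy()
```

Acceptance far above the target signals timid steps, far below signals overreaching; the adaptation balances the two.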
Article
Thesis (Ph.D.), University of California, Los Angeles, 1969. Includes bibliographical references (leaves 55–56). Microfilm.
Article
This paper develops techniques for empirically analyzing demand and supply in differentiated product markets and then applies these techniques to the U.S. automobile industry. The authors' framework enables one to obtain estimates of demand and cost parameters for a class of oligopolistic differentiated products markets. These estimates can be obtained using only widely available product-level and aggregate consumer-level data, and they are consistent with a structural model of equilibrium in an oligopolistic industry. Applying these techniques, the authors obtain parameters for essentially all autos sold over a twenty-year period. Copyright 1995 by The Econometric Society.
Article
To date, the most widely used method for empirical analysis of multiple alternative qualitative choices has been an extension of binary logit analysis called conditional logit analysis. Although this method is extremely attractive because of its computational simplicity, it is burdened with a property termed the "independence of irrelevant alternatives" that is quite unrealistic in many choice situations. We have proposed in this paper a computationally feasible method of estimation not constrained by the independence restriction and which allows for a much richer range of human behavior than does the conditional logit model. An important characteristic of the model is provision for correlation among the random components of utility and, as a by-product, the explicit allowance for variation in tastes across individuals for the attributes of alternatives. We have demonstrated the model and compared it with the logit one by analyzing the travel mode choice decisions of commuters to the central business district of Washington, D.C. Substantial differences are found in predictions based on the two models. The example allows three alternatives. Extension to four or five is quite feasible.
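The "independence of irrelevant alternatives" property mentioned here is visible directly in the conditional logit probabilities:

\[
\frac{P(i)}{P(j)} = \frac{\exp(x_i'\beta)}{\exp(x_j'\beta)},
\]

a ratio that does not depend on the attributes, or even the existence, of any third alternative. The probit model of this paper relaxes the restriction by letting utilities be correlated, \(U_i = x_i'\beta_i + \varepsilon_i\) with \(\varepsilon \sim N(0, \Sigma)\), and by allowing taste variation through random \(\beta_i\).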
Article
An often-used scenario in marketing is that of individuals purchasing in a Poisson manner with their purchasing rates distributed gamma across the population of customers. Ehrenberg (1959) introduced the marketing community to this story and the resulting negative binomial distribution (NBD), and during the past 30 years the NBD model has been shown to work quite well. But the basic gamma/Poisson assumptions lack some face validity. In many product categories, customers purchase more regularly than the exponential interpurchase-time assumption implies. There are some individuals who will never purchase. The purpose of this article is to review briefly the literature that addresses these and other issues. The tractable results presented arise when the basic gamma/Poisson assumptions are relaxed one issue at a time. Some conjectures will be made about the robustness of the NBD when multiple deviations occur together. The NBD may work, but there are still opportunities for working on variations of the NBD theme.
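The gamma/Poisson mixture behind the NBD can be written out in full; this is the standard derivation, with \(r\) and \(\alpha\) the gamma shape and rate and \(t\) the length of the observation period:

\[
P(X = x) = \int_0^\infty \frac{e^{-\lambda t}(\lambda t)^x}{x!}\,
\frac{\alpha^r \lambda^{r-1} e^{-\alpha\lambda}}{\Gamma(r)}\, d\lambda
= \frac{\Gamma(r + x)}{\Gamma(r)\,x!}
\left(\frac{\alpha}{\alpha + t}\right)^{r}
\left(\frac{t}{\alpha + t}\right)^{x}.
\]

The variations the article surveys relax the ingredients one at a time: more regular interpurchase timing than the exponential allows, a segment of never-buyers, and other departures from the basic story.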
Article
In data with a group structure, incidental parameters are included to control for missing variables. Applications include longitudinal data and sibling data. In general, the joint maximum likelihood estimator of the structural parameters is not consistent as the number of groups increases, with a fixed number of observations per group. Instead a conditional likelihood function is maximized, conditional on sufficient statistics for the incidental parameters. In the logit case, a standard conditional logit program can be used. Another solution is a random effects model, in which the distribution of the incidental parameters may depend upon the exogenous variables.
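The conditional-likelihood device is easiest to see in the two-period fixed-effects logit: the sum \(y_{i1} + y_{i2}\) is a sufficient statistic for the incidental parameter \(\alpha_i\), and conditioning on it gives

\[
\Pr\big(y_{i2} = 1 \mid y_{i1} + y_{i2} = 1\big)
= \frac{\exp\big((x_{i2} - x_{i1})'\beta\big)}
{1 + \exp\big((x_{i2} - x_{i1})'\beta\big)},
\]

in which \(\alpha_i\) has cancelled. The structural \(\beta\) can therefore be estimated consistently by a standard logit on differenced covariates, using only the groups that switch.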