Feng Li

Peking University | PKU · "Guanghua" School of Management

PhD


54
Publications
33,448
Reads
1,101
Citations
Introduction
Dr. Feng Li’s research interests include Bayesian Statistics, Econometrics and Forecasting, and Distributed Learning. He develops highly scalable algorithms and software for solving real business problems. His recent research has appeared in top-tier journals such as the Journal of Computational and Graphical Statistics, International Journal of Forecasting, Journal of Business and Economic Statistics, European Journal of Operational Research, and Journal of the Operational Research Society.
Education
September 2008 - June 2013
Stockholm University
Field of study
  • Statistics
September 2003 - July 2007
Renmin University of China
Field of study
  • Statistics


Publications (54)
Article
Full-text available
A general model is proposed for flexibly estimating the density of a continuous response variable conditional on a possibly high-dimensional set of covariates. The model is a finite mixture of asymmetric Student t densities with covariate-dependent mixture weights. The four parameters of the components, the mean, degrees of freedom, scale and skewn...
Article
Full-text available
The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires either collecting or simulating a diverse set of time series benchmarking data to enable reliable comparisons against alternative appr...
Article
Full-text available
In this work, we develop a distributed least squares approximation (DLSA) method that is able to solve a large family of regression problems (e.g., linear regression, logistic regression, and Cox’s model) on a distributed system. By approximating the local objective function using a local quadratic form, we are able to obtain a combined estimator b...
Article
Full-text available
Forecast combinations have been widely applied in the last few decades to improve forecasting. Estimating optimal weights that can outperform simple averages is not always an easy task. In recent years, the idea of using time series features for forecast combinations has flourished. Although this idea has been proved to be beneficial in several for...
Article
Full-text available
We examine the information asymmetry between local and nonlocal investors with a large dataset of stock message board postings. We document that abnormal relative postings of a firm, i.e., unusual changes in the volume of postings from local versus nonlocal investors, capture locals' information advantage. This measure positively predicts firms' sh...
Chapter
Full-text available
In economics and many other forecasting domains, the real world problems are too complex for a single model that assumes a specific data generation process. The forecasting performance of different methods changes depending on the nature of the time series. When forecasting large collections of time series, two lines of approaches have been develop...
Preprint
Full-text available
Following the popularity and success of Judea Pearl's original causality book, this review covers the main topics updated in the 2009 second edition and illustrates an easy-to-follow causal inference strategy in a forecasting scenario. It further discusses some potential benefits and challenges for causal inference with time series forecasting when...
Preprint
While forecast reconciliation has seen great success for real valued data, the method has not yet been comprehensively extended to the discrete case. This paper defines and develops a formal discrete forecast reconciliation framework based on optimising scoring rules using quadratic programming. The proposed framework produces coherent joint probab...
Preprint
Full-text available
In recent decades, new methods and approaches have been developed for forecasting intermittent demand series. However, the majority of research has focused on point forecasting, with little exploration into probabilistic intermittent demand forecasting. This is despite the fact that probabilistic forecasting is crucial for effective decision-making...
Article
Full-text available
Forecast combinations have flourished remarkably in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced from single (target) series is now widely used to improve accuracy through the integration of information gleaned from different sources,...
Article
The practical importance of coherent forecasts in hierarchical forecasting has inspired many studies on forecast reconciliation. Under this approach, base forecasts are produced for every series in the hierarchy and are subsequently adjusted to be coherent in a second reconciliation step. Reconciliation methods have been shown to improve forecast a...
Article
Full-text available
Intermittent demand forecasting is a ubiquitous and challenging problem in production systems and supply chain management. In recent years, there has been a growing focus on developing forecasting approaches for intermittent demand from academic and practical perspectives. However, limited attention has been given to forecast combination methods, w...
Preprint
Full-text available
Forecast combination is widely recognized as a preferred strategy over forecast selection due to its ability to mitigate the uncertainty associated with identifying a single "best" forecast. Nonetheless, sophisticated combinations are often empirically dominated by simple averaging, which is commonly attributed to the weight estimation error. The i...
Article
Full-text available
Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life...
Article
Full-text available
Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life...
Article
Full-text available
In this work, we propose a novel framework for density forecast combination by constructing time-varying weights based on time-varying features. Our framework estimates weights in the forecast combination via Bayesian log predictive scores, in which the optimal forecast combination is determined by time series features from historical information....
Article
Full-text available
Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle the challenges of forecasting ultra-long time series using the industry-standard MapReduce framework....
Preprint
Full-text available
Forecast combinations have flourished remarkably in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced from the single (target) series is now widely used to improve accuracy through the integration of information gleaned from different sour...
Preprint
Full-text available
The practical importance of coherent forecasts in hierarchical forecasting has inspired many studies on forecast reconciliation. Under this approach, so-called base forecasts are produced for every series in the hierarchy and are subsequently adjusted to be coherent in a second reconciliation step. Reconciliation methods have been shown to improve...
Preprint
Intermittent demand forecasting is a ubiquitous and challenging problem in production systems and supply chain management. In recent years, there has been a growing focus on developing forecasting approaches for intermittent demand from academic and practical perspectives. However, limited attention has been given to forecast combination methods, w...
Article
Hierarchical forecasting with intermittent time series is a challenge in both research and empirical studies. Extensive research focuses on improving the accuracy of each hierarchy, especially the intermittent time series at bottom levels. Then, hierarchical reconciliation can be used to improve the overall performance further. In this paper, we pr...
Article
Full-text available
Quantile regression is a method of fundamental importance. How to efficiently conduct quantile regression for a large dataset on a distributed system is of great importance. We show that the popularly used one-shot estimation is statistically inefficient if data are not randomly distributed across different workers. To fix the problem, a novel one-...
Preprint
Full-text available
In this work, we propose a novel framework for density forecast combination by constructing time-varying weights based on time series features, which is called Feature-based Bayesian Forecasting Model Averaging (FEBAMA). Our framework estimates weights in the forecast combination via Bayesian log predictive scores, in which the optimal forecasting...
Article
Full-text available
This paper introduces a novel meta-learning algorithm for time series forecast model performance prediction. We model the forecast error as a function of time series features calculated from historical time series with an efficient Bayesian multivariate surface regression approach. The minimum predicted forecast error is then used to identify an in...
Preprint
Full-text available
Hierarchical forecasting with intermittent time series is a challenge in both research and empirical studies. The overall forecasting performance is heavily affected by the forecasting accuracy of intermittent time series at bottom levels. In this paper, we present a forecasting reconciliation approach that treats the bottom level forecast as laten...
Article
Full-text available
Forecasting is an indispensable element of operational research (OR) and an important aid to planning. The accurate estimation of the forecast uncertainty facilitates several operations management activities, predominantly in supporting decisions in inventory and supply chain management and effectively setting safety stocks. In this paper, we intro...
Preprint
Full-text available
My contributions to this voluminous publication can be found on pp. 38-40, "The natural law of growth in competition", and on pp. 169-170, "Dealing with logistic forecasts in practice".
Preprint
Full-text available
Forecast combination has been widely applied in the last few decades to improve forecast accuracy. In recent years, the idea of using time series features to construct forecast combination models has flourished in the forecasting area. Although this idea has been proved to be beneficial in several forecast competitions such as the M3 and M4 competit...
Article
Background: Rural counties in the United States have higher firearm suicide rates and opioid overdoses than urban counties. We sought to determine whether rural counties can be grouped based on these “diseases of despair.” Methods: Age-adjusted firearm suicide death rates per 100,000; drug-related death rates per 100,000; homicide rate per 100,000,...
Article
Full-text available
Accurate forecasts are vital for supporting the decisions of modern companies. Forecasters typically select the most appropriate statistical model for each time series. However, statistical models usually presume some data generation process while making strong assumptions about the errors. In this paper, we present a novel data-centric approach-'f...
Article
Background: Hospitalized self-inflicted firearm injuries have not been extensively studied, particularly regarding clinical diagnoses at the index admission. The objective of this study was to discover the diagnostic phenotypes (DPs) or clusters of hospitalized self-inflicted firearm injuries. Methods: Using Nationwide Inpatient Sample data...
Preprint
Full-text available
Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle challenges associated with forecasting ultra-long time series by utilizing the industry-standard MapRe...
Chapter
Full-text available
This article considers a bilinear model that includes two different latent effects. The first effect has a direct influence on the response variable, whereas the second latent effect is assumed to first influence other latent variables, which in turn affect the response variable. In this article, latent variables are modelled via rank restrictions...
Article
Full-text available
Background: Firearm-related death rates and years of potential life lost (YPLL) vary widely between population subgroups and states. However, changes or inflections in temporal trends within subgroups and states are not fully documented. We assessed temporal patterns and inflections in the rates of firearm deaths and %YPLL due to firearms for overal...
Preprint
Full-text available
Accurate forecasts are vital for supporting the decisions of modern companies. In order to improve statistical forecasting performance, forecasters typically select the most appropriate model for each dataset. However, statistical models presume a data generation process, while making strong distributional assumptions about the errors. In this paper,...
Preprint
Full-text available
This paper introduces a novel meta-learning algorithm for time series forecasting. The efficient Bayesian multivariate surface regression approach is used to model forecast error as a function of features calculated from the time series. The minimum predicted forecast error is then used to identify an individual model or combination of models to pr...
Preprint
Full-text available
In this work we develop a distributed least squares approximation (DLSA) method, which is able to solve a large family of regression problems (e.g., linear regression, logistic regression, Cox's model) on a distributed system. By approximating the local objective function using a local quadratic form, we are able to obtain a combined estimator by t...
Preprint
Full-text available
Interval forecasts have significant advantages in providing uncertainty estimation to point forecasts, leading to the importance of providing prediction intervals (PIs) as well as point forecasts. In this paper, we propose a general feature-based time series forecasting framework, which is divided into "offline" and "online" parts. In the "offline"...
Article
Full-text available
Understanding how defaults correlate across firms is a persistent concern in risk management. In this paper, we apply covariate-dependent copula models to assess the dynamic nature of credit risk dependence, which we define as “credit risk clustering”. We also study the driving forces of credit risk clustering in the CEC business group in China. Ou...
Preprint
Full-text available
Feature-based time series representation has attracted substantial attention in a wide range of time series analysis methods. Recently, the use of time series features for forecast model selection and model averaging has been an emerging research focus in the forecasting community. Nonetheless, most of the existing approaches depend on the manual c...
Preprint
Full-text available
The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires a diverse collection of time series benchmarking data to enable reliable comparisons against alternative approaches. We propose GeneRA...
Preprint
Full-text available
The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires a diverse collection of time series data to enable reliable comparisons against alternative approaches. We propose the use of mixture...
Article
Full-text available
Copulas provide an attractive approach to the construction of multivariate distributions with flexible marginal distributions and different forms of dependences. Of particular importance in many areas is the possibility of forecasting the tail-dependences explicitly. Most of the available approaches are only able to estimate tail-dependences and co...
Article
Full-text available
Purpose: Globally, the age-standardised prevalence of type 2 diabetes mellitus (T2DM) has nearly doubled from 1980 to 2014, rising from 4.7% to 8.5% with an estimated 422 million adults living with the chronic disease. The MULTI sTUdy Diabetes rEsearch (MULTITUDE) consortium was recently established to harmonise data from 17 independent cohort studi...
Article
Full-text available
Understanding how defaults correlate across firms is a persistent concern in risk management. In this paper, we apply covariate-dependent copula models to assess the dynamics of credit risk dependence and its driving forces based on an empirical study of a business group in China. Our empirical analysis shows that the tail dependence of credit risk...
Article
Copulas provide an attractive approach for constructing multivariate densities with flexible marginal distributions and different forms of dependence. Of particular importance in many areas is the possibility of explicitly modeling tail-dependence. Most of the available approaches estimate tail-dependence and correlations via nuisance parameters, y...
Thesis
Full-text available
This thesis develops models and associated Bayesian inference methods for flexible univariate and multivariate conditional density estimation. The models are flexible in the sense that they can capture widely differing shapes of the data. The estimation methods are specifically designed to achieve flexibility while still avoiding overfitting. The m...
Article
Full-text available
Methods for choosing a fixed set of knot locations in additive spline models are fairly well established in the statistical literature. While most of these methods are in principle directly extendable to non-additive surface models, they are less likely to be successful in that setting because of the curse of dimensionality, especially when there a...
Chapter
Full-text available
Introduction; The model and prior; Inference methodology; Applications; Conclusions; Acknowledgements; Appendix: Implementation details for the gamma and log-normal models; References
Article
Smooth mixtures, i.e. mixture models with covariate-dependent mixing weights, are very useful flexible models for conditional densities. Previous work shows that using too simple mixture components for modeling heteroscedastic and/or heavy tailed data can give a poor fit, even with a large number of components. This paper explores how well a smooth...
