Table 3 - uploaded by Guoqi Qian
Example 3.2-QIC for model selection under normal mean and variance specification

Source publication
Article
The Generalized Estimating Equations (GEE) method is one of the most commonly used statistical methods for the analysis of longitudinal data in epidemiological studies. A working correlation structure for the repeated measures of the outcome variable of a subject needs to be specified by this method. However, statistical criteria for selecting the...

Context in source publication

Context 1
... a normal mean and variance function for log FEV1 with an identity link function, we calculate the QIC value for several correlation structures as shown in Table 3. The exchangeable structure is found to have the smallest QIC among all the full models, and is thus chosen as the preferred working covariance model. ...
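The selection rule in this excerpt (compute QIC for each candidate working correlation structure, keep the smallest) can be sketched as follows. This is a minimal illustration only: the function name `select_working_correlation` is hypothetical, and the QIC values in the usage lines are made-up placeholders, not the values reported in Table 3.

```python
def select_working_correlation(qic_by_structure):
    """Return the name of the structure with the smallest QIC.

    qic_by_structure maps a structure name to its QIC value; the
    QIC values themselves are assumed to come from a fitted GEE model.
    """
    if not qic_by_structure:
        raise ValueError("at least one candidate structure is required")
    return min(qic_by_structure, key=qic_by_structure.get)


# Placeholder numbers for illustration only (NOT taken from Table 3).
example_qic = {"independence": 105.2, "exchangeable": 98.7, "AR(1)": 101.4}
chosen = select_working_correlation(example_qic)
```

With these placeholder inputs the function picks the entry with the lowest value; in practice the dictionary would hold the QIC reported by the GEE software for each fitted model.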

Similar publications

Article
Objective: The association between depressive symptoms (Center for Epidemiologic Studies Depression Scale [CES-D]) and subsequent cognitive function (Mini-Mental State Examination [MMSE]) is equivocal in the literature. To examine the causal relationship between them, we use longitudinal data on MMSE and CES-D and causal inference to illustrate the rela...

Citations

... For model comparison, we consider several information criteria, such as AIC, BIC, deviance information criteria (DIC) (Spiegelhalter et al. 2002), marginal DIC (DIC2) (Du et al. 2023), and correlation information criteria (CIC) (Cui and Qian 2007). ...
Article
Longitudinal studies have been conducted in various fields, including medicine, economics and the social sciences. In this paper, we focus on longitudinal ordinal data. Since the longitudinal data are collected over time, repeated outcomes within each subject may be serially correlated. To address both the within-subjects serial correlation and the specific variance between subjects, we propose a Bayesian cumulative probit random effects model for the analysis of longitudinal ordinal data. The hypersphere decomposition approach is employed to overcome the positive definiteness constraint and high-dimensionality of the correlation matrix. Additionally, we present a hybrid Gibbs/Metropolis-Hastings algorithm to efficiently generate cutoff points from truncated normal distributions, thereby expediting the convergence of the Markov Chain Monte Carlo (MCMC) algorithm. The performance and robustness of our proposed methodology under misspecified correlation matrices are demonstrated through simulation studies under complete data, missing completely at random (MCAR), and missing at random (MAR). We apply the proposed approach to analyze two sets of actual ordinal data: the arthritis dataset and the lung cancer dataset. To facilitate the implementation of our method, we have developed BayesRGMM, an open-source R package available on CRAN, accompanied by comprehensive documentation and source code accessible at https://github.com/kuojunglee/BayesRGMM/.
... kg/m²), and obesity (> 30 kg/m²). [29] C-Reactive Protein (CRP), triglycerides, and interleukin 6 (IL-6) were measured following a standard protocol. Plasma CRP levels were obtained from the enzyme-linked ultrasensitive immunosorbent method. ...
... GEE is a population-average method developed from quasi-likelihood theory and hence does not require one to specify the distribution of the response variable, only the mean and variance functions of the response observations (Carey and Wang, 2011). Further, as observed by Cui and Qian (2007), even with a mis-specified working correlation structure, GEE analysis still yields consistent regression coefficient estimators. Pan (2001) developed and championed the routine use of the Quasi-likelihood Information Criteria (QIC) for the selection of both the correlation matrix and the best subset of explanatory variables. ...
... QIC has, however, been established to more often select a wrong correlation structure, leading to less efficient GEE estimators to the extent of 40%. The findings on correlation matrix selection performance by Pan (2001) were supported by other findings which established that QIC was weak in picking out the true correlation matrix for repeated measurements (Nyabwanga et al., 2019a; Wang et al., 2012; Cui and Qian, 2007). The wanting performance of QIC has over the years led to several modifications by scholars such as Hin and Wang (2009), who developed the Correlation Information Criterion (CIC), and Chen and Lazar (2012), who developed EAIC and EBIC, among others, in efforts to increase the chances of selecting the correct matrix and hence enhance the efficiency of the estimates. ...
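For reference, the two criteria named in these excerpts are commonly written as below, following Pan (2001) for QIC and Hin and Wang (2009) for CIC. The notation is introduced here for illustration and does not appear in the excerpts: R is the candidate working correlation structure, \hat\beta(R) the GEE estimate obtained under R, Q the quasi-likelihood evaluated under the independence working model I, \hat\Omega_I the inverse of the model-based covariance estimate under independence, and \hat V_R the robust (sandwich) covariance estimate under R.

```latex
\begin{align}
  \mathrm{QIC}(R) &= -2\, Q\bigl(\hat\beta(R);\, I\bigr)
                     + 2\, \operatorname{tr}\bigl(\hat\Omega_I \hat V_R\bigr), \\
  \mathrm{CIC}(R) &= \operatorname{tr}\bigl(\hat\Omega_I \hat V_R\bigr).
\end{align}
% For a normal variance function with identity link and dispersion \phi,
% the quasi-likelihood reduces to a scaled residual sum of squares:
\begin{equation}
  Q(\beta;\, I) = -\frac{1}{2\phi} \sum_{i} \sum_{j} \bigl(y_{ij} - \mu_{ij}(\beta)\bigr)^{2}.
\end{equation}
```

Written this way, CIC is the penalty term of QIC without the factor of 2 and without the quasi-likelihood term, so it depends on R only through the trace.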
Article
The study proposes a two-step hybrid methodology for sparse generalized estimation equations modeling of the drivers of shareholder value creation. Through the methodology, the validity of the Gordon constant growth model is established and other non-dividend factors' contribution to shareholder value creation is assessed. The two-step hybrid method involves picking out the right intra-subject correlation matrix and set of regressors using EAIC and QIC respectively (EAIC-QIC) and then obtaining the penalized GEE estimators of the selected model. Penalization is useful in removing redundant regressors from the final model. The performance of the proposed method was compared to that of exclusively using the QIC method in selecting both the correlation matrix and set of regressors. The study results showed that, whereas EAIC preferred the parsimonious order one autoregressive {AR(1)} structure for the data, QIC preferred the unstructured matrix, which estimates the highest number of correlation parameters. Using the AR(1) structure and Algorithm 2, the GEE model chosen had higher efficiency compared to when QIC is used to select both the correlation matrix and regressors. Based on the results, the study concludes that adopting hybrid methods enhances the efficiency of GEE estimators. On firm value, the study concludes that besides the elements in the Gordon constant growth model, the financial health of a firm is a vital indicator of its value creation ability.
... The QIC further decreases by 67 (from 9632 to 9565) when all variables and all moderation terms are included (compare models 2 and 6 in Table 2). Both decreases signal that the final model, that considers a TMT's structural attributes, underlying its ability to act as transformational leader to enable knowledge diversity for innovation, is the most parsimonious one (Cui and Qian, 2007). Table 1 provides descriptive statistics and correlations. ...
... (QIC) was analyzed to determine the appropriate correlation structure. The QIC allows for the selection of covariates and working correlation structure simultaneously (Cui & Qian, 2007; Pan, 2001). The analysis showed that the independent correlation matrix had a lower QIC value compared to the unstructured matrix; therefore, the GEE model was estimated with the former matrix. ...
Article
Purpose: This study addresses the complex task of determining the criminal intensity posed by serial killers in a murder series by introducing the Lambda (− rate of killings) to adjust for the time span in a murder series. It focuses on examining factors related to the offender and the crime-commission process that influence victim count in a series. Methods: Generalized estimating equations with a negative binomial and a gamma log link function were used to examine factors predicting victim count in a sample of 1258 serial murder cases. Results: Results showed that offender criminal history did not predict higher levels of Lambda when assessing victim count alone, but did predict a lower value when series length was accounted for. Killing methods were also significant predictors of a higher Lambda but were less useful when only number of victims was considered. Conclusions: Findings highlight the importance of the rate of killings along with total victim count for a more comprehensive understanding of the series' criminal intensity. This approach has implications for law enforcement and criminal profiling as it offers a more detailed perspective on the immediate threat posed by serial killers.
... which can be solved numerically using integrate() in R, for example. The QIC measures can be used to select both covariates and the best working correlation matrix, as suggested by Cui (2007) and Cui and Qian (2007). ...
Article
Experiments with repeated measures are those in which more than one observation per subject is available. To model such experiments, dependency within subjects needs to be taken into consideration. In cases where the variable of interest is bounded in (a, b), with a < b known reals, there are few proposals to model correlated bounded data, most of them based on the Beta, Simplex and Unit Gamma distributions. In particular, for marginal modeling of the mean and precision/dispersion, Simplex and Beta models based on Generalized Estimating Equations (GEE) are used. In this paper, to take account of possible within-subject dependence using the GEE approach, we propose a Unit Gamma regression model for bounded data in the unit interval. We also develop residuals and influence diagnostic tools for the Simplex and Unit Gamma models for correlated bounded data. Furthermore, to assess the finite-sample performance of the proposed estimators, we conducted a Monte Carlo simulation study. The methodology is illustrated with the analysis of a real data set. An R package was developed for all the new methodology described in this paper.
... This also means that the classical approach entrusts the entire correction to the empirical covariance matrix. GEE and FGLS add Steps 2 and 3 to help the correction be more efficient (Cui & Qian, 2007). These additional steps build some aspects of clustering into the standard errors in Step 3 so that the empirical covariance matrix in Step 4 does not bear the full brunt of the correction. ...
Article
Psychological data are often clustered within organizational units, which violates the independence assumption in standard regression models. Clustered errors, multilevel models, and fixed-effects models all address this issue, but in different ways. Disciplinary preferences for approaching clustered data are strong, which can restrict questions researchers ask because certain approaches are better equipped to handle particular types of questions. Resources comparing approaches to facilitate broader understanding of clustered data approaches exist for economists, political scientists, and biostatisticians. These existing resources use concepts and terminology consistent with statistical training in other disciplines, so this article provides a resource using language and principles familiar to psychologists. The article starts by walking through the origin and importance of the independence assumption to motivate the problem and emergence of different solutions in different fields. Then, information on clustered errors, multilevel models, and fixed-effect models is provided, including (a) how each approach addresses independence violations, (b) research questions ideally suited for each approach, and (c) example analyses highlighting advantages and disadvantages. The article then discusses how these approaches are not mutually exclusive but instead can be blended together to create tailor-made models that flexibly accommodate idiosyncrasies in research questions and are robust to nuances of a particular data set. The broader theme is that there is no one-size-fits-all approach to clustered data. The research question—not disciplinary preferences—should inform the statistical approach. Wider appreciation of the landscape of clustered data approaches can expand the questions researchers ask and improve the theoretical foundation of statistical models.
... Since GEE is not a likelihood-based method, statistics like AIC and BIC are not appropriate. The QIC statistic is appropriate for quasi-likelihood estimation which includes GEE (Cui & Qian, 2007; Pan, 2001). For each model analyzed in this study, the compound symmetry specification invariably produced the lowest QIC statistics among those available (independence, autoregressive, m-dependent) for the residual correlation matrix. ...
Article
Socioeconomic status (SES) is considered a powerful influence on children’s cognitive development and student achievement. This model has generated an enormous literature on the nature of, explanations for, and policy implications arising from SES inequalities in early childhood cognitive outcomes and student achievement. An alternative model focuses on the associations between SES and parental ability, the parent-child transmission of ability, and the association between children’s ability and their test scores. This study analyses two ability and three achievement measures, with composite and multiple SES measures and a commonly used indicator of the home environment (HOME) in children aged from 3 to 15. The associations between SES and children’s test scores are only partially accounted for by the home environment, which itself has only small to moderate associations with test scores, independent of SES. Adding mother’s cognitive ability substantially reduces the coefficients for the composite SES measure by between 50% and 60%, and for mother’s education by between 56% and 87%. The contemporaneous effects of SES and the home environment are small or very small. Sizable percentages of the variance in the five outcome measures are attributable to genetics ranging from 38% for the Peabody Picture Vocabulary Test (PPVT) to 77% for reading recognition. The contributions of the shared environment ranged from 14% for reading recognition to 41% for the PPVT. Therefore, genetics is important, and the non-trivial contributions of the common environment are more likely to reflect school and neighborhood factors rather than parental SES and the home environment.
... Given that each experimental container was observed daily throughout the study period, we used an autoregressive correlation structure to account for the temporal dependency between sequential daily observations. We chose an autoregressive correlation structure using QIC (quasi-likelihood under the independence model criterion) as it yielded best model fit (Appendix A) (Cui and Qian, 2007). The autoregressive correlation accounts for time-varying correlation by assuming that measurements taken closer in time are more highly correlated than measurements taken farther apart (Littell et al., 2000). ...
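To make the idea in this excerpt concrete, a first-order autoregressive working correlation matrix can be sketched as below, with an exchangeable matrix alongside for contrast. This is a minimal illustration assuming equally spaced measurement occasions; the function names and the value of `rho` are hypothetical and are not taken from the cited study.

```python
def ar1_correlation(n_times, rho):
    """AR(1) working correlation: corr(y_j, y_k) = rho ** |j - k|.

    Assumes equally spaced occasions, so correlation decays
    geometrically as the gap between two measurements grows.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [[rho ** abs(j - k) for k in range(n_times)]
            for j in range(n_times)]


def exchangeable_correlation(n_times, rho):
    """Exchangeable (compound symmetry): the same rho for every pair."""
    return [[1.0 if j == k else rho for k in range(n_times)]
            for j in range(n_times)]


# Four daily observations with an illustrative rho of 0.5.
R_ar1 = ar1_correlation(4, 0.5)
R_exch = exchangeable_correlation(4, 0.5)
```

Under AR(1) the correlation between the first and last of the four occasions is `rho ** 3`, whereas under the exchangeable structure every off-diagonal entry equals `rho` regardless of the time gap.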
Article
Organisms living in environmentally stable ecosystems are hypothesized to exhibit narrow environmental tolerance ranges; however, previous experiments testing this prediction with invertebrates in spring habitats are equivocal. Here we examined the effects of elevated temperatures on four riffle beetle species (family: Elmidae) native to central and west Texas, USA. Two of these, Heterelmis comalensis and Heterelmis cf. glabra are known to occupy habitats immediately adjacent to spring openings and are thought to have stenothermal tolerance profiles. The other two, Heterelmis vulnerata and Microcylloepus pusillus are surface stream species with more cosmopolitan distributions and are assumed to be less sensitive to variation in environmental conditions. We examined performance and survival of elmids in response to increasing temperatures using dynamic and static assays. Additionally, changes in metabolic rate in response to thermal stress were assessed for all four species. Our results indicated that spring-associated H. comalensis is most sensitive while the more cosmopolitan elmid M. pusillus is least sensitive to thermal stress. However, there were differences in temperature tolerances among the two spring-associated species: H. comalensis had relatively narrow thermal tolerance in comparison to H. cf. glabra. This could be due to differences in the climatic and hydrological conditions in the geographical regions which the respective riffle beetle populations reside. However, despite these differences, H. comalensis and H. cf. glabra showed a dramatic increase in their metabolic rates with increasing temperatures indicating that these species are indeed spring specialists and likely have a stenothermal profile.
... In sample 1, we performed a set of lagged GEE models with a binomial distribution, logit link function, and exchangeable within-group correlation structure [29]. Robust standard errors were used to account for correlation between measures for each measure [29]. We analyzed the lagged GEE models, in which homebound status at time t was related to falls at time t + 1 after adjusting for the falls at time t. ...
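The lagged design described in this excerpt (exposure at time t, outcome at time t + 1, adjusting for the outcome at time t) amounts to reshaping each subject's records into wave pairs before fitting. The sketch below shows one way to build those rows; the data layout, the function name `lagged_rows`, and the field names are assumptions made for illustration, not the cited study's code.

```python
def lagged_rows(panel):
    """Pair each subject's status at wave t with the outcome at wave t + 1.

    panel maps subject id -> {wave: (homebound, fell)} (hypothetical layout).
    A wave is used only when the immediately following wave is observed.
    """
    rows = []
    for subject, waves in panel.items():
        for t in sorted(waves):
            if t + 1 not in waves:
                continue  # no follow-up wave, so no lagged outcome
            homebound_t, fell_t = waves[t]
            _, fell_next = waves[t + 1]
            rows.append({
                "subject": subject,
                "wave": t,
                "homebound_t": homebound_t,   # exposure at time t
                "fell_t": fell_t,             # adjustment covariate at time t
                "fell_t_plus_1": fell_next,   # outcome at time t + 1
            })
    return rows


# Toy records for illustration only; subject "b" has a gap at wave 2.
toy_panel = {
    "a": {1: (0, 0), 2: (1, 0), 3: (1, 1)},
    "b": {1: (1, 0), 3: (0, 1)},
}
toy_rows = lagged_rows(toy_panel)
```

Each resulting row is one observation for the lagged model, and rows from the same subject form the cluster over which the working correlation is defined.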
Article
Background Previous research has shown an association between homebound status and falls among older adults. However, this association was primarily drawn from cross-sectional studies. This study aimed to determine the bidirectional relationship between homebound status and falls among older adults in the community. Methods We used data of the community-dwelling older adults from 2011 to 2015 of the National Health and Aging Trends Study, a nationally representative survey of Medicare Beneficiaries in the United States (Sample 1 [No falls at baseline]: N = 2,512; Sample 2 [Non-homebound at baseline]: N = 2,916). Homebound status was determined by the frequency, difficulty, and needing help for outdoor mobility. Falls were ascertained by asking participants whether they had a fall in the last year. Generalized estimation equation models were used to examine the bidirectional association between homebound status and falls longitudinally. Results Participants with no falls at baseline (n = 2,512) were on average, 76.8 years old, non-Hispanic whites (70.1%), and female (57.1%). After adjusting for demographics and health-related variables, prior year homebound status significantly contributed to falls in the following year (Odds ratio [OR], 1.28, 95% CI: 1.09–1.51). Participants who were non-homebound at baseline (n = 2,916) were on average, 75.7 years old, non-Hispanic white (74.8%), and female (55.8%). Previous falls significantly predicted later homebound status (OR, 1.26, 95% CI: 1.10–1.45) in the full adjusted model. Conclusion This is the first longitudinal study to determine the bidirectional association between homebound status and falls. Homebound status and falls form a vicious circle and mutually reinforce each other over time. Our findings suggest the importance of developing programs and community activities that reduce falls and improve homebound status among older adults.