hglm: A Package for Fitting Hierarchical Generalized Linear Models

December 2010
The R Journal 2(2)

DOI:10.32614/RJ-2010-009

Authors:

Xia Shen

The University of Edinburgh

Alam Moudud

Dalarna University

CONTRIBUTED RESEARCH ARTICLES

hglm: A Package for Fitting Hierarchical Generalized Linear Models

by Lars Rönnegård, Xia Shen and Moudud Alam

Abstract We present the hglm package for fitting hierarchical generalized linear models. It can be used for linear mixed models and generalized linear mixed models with random effects for a variety of links and a variety of distributions for both the outcomes and the random effects. Fixed effects can also be fitted in the dispersion part of the model.

Introduction

The hglm package (Alam et al., 2010) implements the estimation algorithm for hierarchical generalized linear models (HGLM; Lee and Nelder, 1996). The package fits generalized linear models (GLM; McCullagh and Nelder, 1989) with random effects, where the random effect may come from a distribution conjugate to one of the exponential-family distributions (normal, gamma, beta or inverse-gamma). The user may explicitly specify the design matrices both for the fixed and random effects. In consequence, correlated random effects, as well as random regression models, can be fitted. The dispersion parameter can also be modeled with fixed effects.

The main function is hglm() and the input is specified in a similar manner as for glm(). For instance,

R> hglm(fixed = y ~ week, random = ~ 1|ID,
+       family = binomial(link = logit))

fits a logit model for y with week as fixed effect and ID representing the clusters for a normally distributed random intercept. Given an hglm object, the standard generic functions are print(), summary() and plot().

Generalized linear mixed models (GLMM) have previously been implemented in several R functions, such as the lmer() function in the lme4 package (Bates and Maechler, 2010) and the glmmPQL() function in the MASS package (Venables and Ripley, 2002). In GLMM, the random effects are assumed to be Gaussian whereas the hglm() function allows other distributions to be specified for the random effect. The hglm() function also extends the fitting algorithm of the dglm package (Dunn and Smyth, 2009) by including random effects in the linear predictor for the mean, i.e. it extends the algorithm so that it can cope with mixed models. Moreover, the model specification in hglm() can be given as a formula or alternatively in terms of y, X, Z and X.disp. Here y is the vector of observed responses, X and Z are the design matrices for the fixed and random effects, respectively, in the linear predictor for the means and X.disp is the design matrix for the fixed effects in the dispersion parameter. This enables a more flexible modeling of the random effects than specifying the model by an R formula. Consequently, this option is not as user friendly but gives the user the possibility to fit random regression models and random effects with known correlation structure.

The hglm package produces estimates of fixed effects, random effects and variance components as well as their standard errors. In the output it also produces diagnostics such as deviance components and leverages.

Three illustrating models

The hglm package makes it possible to

1. include fixed effects in a model for the residual variance,

2. fit models where the random effect distribution is not necessarily Gaussian,

3. estimate variance components when we have correlated random effects.

Below we describe three models that can be fitted using hglm(), which illustrate these three points. Later, in the Examples section, five examples are presented that include the R syntax and output for the hglm() function.

Linear mixed model with fixed effects in the residual variance

We start by considering a normal-normal model with heteroscedastic residual variance. In biology, for instance, this is important if we wish to model a random genetic effect (e.g., Rönnegård and Carlborg, 2007) for a trait y, where the residual variance differs between the sexes.

For the response y and observation number i we have:

$$
y_i \mid \beta, u, \beta_d \sim N\left(X_i\beta + Z_i u,\ \exp(X_{d,i}\beta_d)\right)
$$

$$
u \sim \mathrm{MVN}\left(0,\ I\sigma_u^2\right)
$$

where $\beta$ are the fixed effects in the mean part of the model, the random effect $u$ represents random variation among clusters of observations and $\beta_d$ is the fixed effect in the residual variance part of the model. The variance of the random effect $u$ is given by $\sigma_u^2$.

The R Journal Vol. 2/2, December 2010 ISSN 2073-4859

The subscript $i$ for the matrices $X$, $Z$, and $X_d$ indicates the i'th row. Here, a log link function is used for the residual variance and the model for the residual variance is therefore given by $\exp(X_{d,i}\beta_d)$. In the more general GLM notation, the residual variance here is described by the dispersion term $\phi$, so we have $\log(\phi_i) = X_{d,i}\beta_d$.

This model cannot be fitted with the dglm package, for instance, because we have random effects in the mean part of the model. It is also beyond the scope of the lmer() function since we allow a model for the residual variance.

The implementation in hglm() for this model is demonstrated in Example 2 in the Examples section below.

A Poisson model with gamma distributed random effects

For dependent count data it is common to model a Poisson distributed response with a gamma distributed random effect (Lee et al., 2006). If we assume no overdispersion conditional on $u$ and thereby have a fixed dispersion term, this model may be specified as:

$$
E(y_i \mid \beta, u) = \exp(X_i\beta + Z_i v)
$$

where a level $j$ in the random effect $v$ is given by $v_j = \log(u_j)$ and the $u_j$ are iid with gamma distribution having mean and variance: $E(u_j) = 1$, $\mathrm{var}(u_j) = \lambda$.

This model can also be fitted with the hglm package, since it extends existing GLMM functions (e.g. lmer()) to allow a non-normal distribution for the random effect. Later on, in Example 3, we show the hglm() code used for fitting a gamma-Poisson model with fixed effects included in the dispersion parameter.
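As a small illustration of the Poisson-gamma specification, the following base-R sketch simulates clustered counts with E(u_j) = 1 and var(u_j) = λ. It is not taken from the package; the names and values (n_clus, n_per_clus, lambda, beta0) are assumptions chosen only for the example.

```r
# Sketch: simulate clustered counts from a Poisson model with gamma random effects.
# All names and values here (n_clus, lambda, beta0, ...) are illustrative assumptions.
set.seed(42)
n_clus     <- 10    # number of clusters (levels of the random effect)
n_per_clus <- 20    # observations per cluster
lambda     <- 0.5   # var(u_j); E(u_j) is fixed at 1
beta0      <- 1     # intercept on the log scale

# A gamma distribution with shape 1/lambda and scale lambda has mean 1 and variance lambda
u <- rgamma(n_clus, shape = 1 / lambda, scale = lambda)
v <- log(u)                                   # v_j = log(u_j) enters the linear predictor

X <- matrix(1, n_clus * n_per_clus, 1)        # intercept-only design matrix
Z <- diag(n_clus) %x% rep(1, n_per_clus)      # cluster indicator matrix

mu_cond <- as.vector(exp(X %*% beta0 + Z %*% v))  # E(y_i | beta, u)
y <- rpois(length(mu_cond), mu_cond)              # Poisson counts given the random effects
```

Because v enters on the log scale, each cluster mean is exp(beta0) multiplied by its gamma effect u_j, which is why the constraint E(u_j) = 1 keeps the intercept interpretable.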

A linear mixed model with a correlated random effect

In animal breeding it is important to estimate variance components prior to ranking of animal performances (Lynch and Walsh, 1998). In such models the genetic effect of each animal is modeled as a level in a random effect and the correlation structure $A$ is a matrix with known elements calculated from the pedigree information. The model is given by

$$
y_i \mid \beta, u \sim N\left(X_i\beta + Z_i u,\ \sigma_e^2\right)
$$

$$
u \sim \mathrm{MVN}\left(0,\ A\sigma_u^2\right)
$$

This may be reformulated as (see Lee et al., 2006; Rönnegård and Carlborg, 2007)

$$
y_i \mid \beta, u^* \sim N\left(X_i\beta + Z_i^* u^*,\ \sigma_e^2\right)
$$

$$
u^* \sim \mathrm{MVN}\left(0,\ I\sigma_u^2\right)
$$

where $Z^* = ZL$ and $L$ is the Cholesky factorization of $A$ (so that $A = LL^T$).

Thus the model can be fitted using the hglm() function with a user-specified input matrix Z (see R code in Example 4 below).
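The equivalence of the two formulations can be checked numerically. The sketch below uses base R only; the 3 x 3 matrix A, the incidence matrix and the value of sigma2_u are made-up assumptions for illustration.

```r
# Sketch: rewriting a correlated random effect u ~ MVN(0, A * sigma2_u)
# as Z* u* with independent u*. The 3 x 3 matrix A is a made-up example.
A <- matrix(c(1.00, 0.50, 0.25,
              0.50, 1.00, 0.50,
              0.25, 0.50, 1.00), nrow = 3, byrow = TRUE)

# chol() returns the upper-triangular factor R with t(R) %*% R == A,
# so the lower-triangular factor L with L %*% t(L) == A is t(R)
L <- t(chol(A))

# Incidence matrix: two observations per individual (illustrative)
Z <- diag(3) %x% rep(1, 2)
Z_star <- Z %*% L

sigma2_u <- 1.5
cov_original    <- Z %*% (A * sigma2_u) %*% t(Z)    # Var(Z u)
cov_transformed <- sigma2_u * Z_star %*% t(Z_star)  # Var(Z* u*)
all.equal(cov_original, cov_transformed)
```

Note that chol() in R returns the upper-triangular factor, so it has to be transposed before it is multiplied onto Z.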

Overview of the fitting algorithm

The fitting algorithm is described in detail in Lee et al. (2006) and is summarized as follows. Let $n$ be the number of observations and $k$ be the number of levels in the random effect. The algorithm is then:

1. Initialize starting values.

2. Construct an augmented model with response $y_{aug} = \begin{pmatrix} y \\ E(u) \end{pmatrix}$.

3. Use a GLM to estimate $\beta$ and $v$ given the vector $\phi$ and the dispersion parameter for the random effect $\lambda$. Save the deviance components and leverages from the fitted model.

4. Use a gamma GLM to estimate $\beta_d$ from the first $n$ deviance components $d$ and leverages $h$ obtained from the previous model. The response variable and weights for this model are $d/(1-h)$ and $(1-h)/2$, respectively. Update the dispersion parameter by putting $\phi$ equal to the predicted response values for this model.

5. Use a similar GLM as in Step 4 to estimate $\lambda$ from the last $k$ deviance components and leverages obtained from the GLM in Step 3.

6. Iterate between steps 3-5 until convergence.

For a more detailed description of the algorithm in a particular context, see below.

H-likelihood theory

Let $y$ be the response and $u$ an unobserved random effect. The hglm package fits a hierarchical model $y \mid u \sim f_m(\mu, \phi)$ and $u \sim f_d(\psi, \lambda)$ where $f_m$ and $f_d$ are specified distributions for the mean and dispersion parts of the model.

We follow the notation of Lee and Nelder (1996), which is based on the GLM terminology by McCullagh and Nelder (1989). We also follow the likelihood approach where the model is described in terms of likelihoods. The conditional (log-)likelihood for $y$ given $u$ has the form of a GLM

$$
\ell(\theta', \phi; y \mid u) = \frac{y\theta' - b(\theta')}{a(\phi)} + c(y, \phi) \tag{1}
$$

where $\theta'$ is the canonical parameter, $\phi$ is the dispersion term, and $\mu'$ is the conditional mean of $y$ given $u$, with $\eta' = g(\mu')$, i.e. $g()$ is a link function for the GLM. The linear predictor is given by $\eta' = \eta + v$ where $\eta = X\beta$ and $v = v(u)$ for some strict monotonic function of $u$. The link function $v(u)$ should be specified so that the random effects occur linearly in the linear predictor to ensure meaningful inference from the h-likelihood (Lee et al., 2007). The h-likelihood or hierarchical likelihood is defined by

$$
h = \ell(\theta', \phi; y \mid u) + \ell(\alpha; v) \tag{2}
$$

where $\ell(\alpha; v)$ is the log density for $v$ with parameter $\alpha$. The estimates of $\beta$ and $v$ are given by $\partial h / \partial \beta = 0$ and $\partial h / \partial v = 0$. The dispersion components are estimated by maximizing the adjusted profile h-likelihood

$$
h_p = \left( h - \frac{1}{2} \log \left| -\frac{1}{2\pi} H \right| \right)_{\beta = \hat{\beta},\, v = \hat{v}} \tag{3}
$$

where $H$ is the Hessian matrix of the h-likelihood.

The dispersion term $\phi$ can be connected to a linear predictor $X_d\beta_d$ given a link function $g_d(\cdot)$ with $g_d(\phi) = X_d\beta_d$. The adjusted profile likelihoods of $\ell$ and $h$ may be used for inference of $\beta$, $v$ and the dispersion parameters $\phi$ and $\lambda$ (p. 186 in Lee et al., 2006). More detail and discussion of h-likelihood theory is presented in the hglm vignette.
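To make Eq. (2) and the score equations concrete, the following base-R sketch writes out the h-likelihood for a normal-normal model with known dispersion parameters and verifies that the score vanishes at the joint solution. The data, the dimensions and the helper names h_lik() and score() are assumptions for illustration and are not functions of the hglm package.

```r
# Sketch: h-likelihood of a normal-normal HGLM with known dispersion parameters.
# Data, dimensions and parameter values are illustrative assumptions.
set.seed(1)
k <- 4; m <- 5                       # k clusters with m observations each
X <- matrix(1, k * m, 1)
Z <- diag(k) %x% rep(1, m)
phi <- 1; lambda <- 0.5              # residual and random-effect variances
y <- as.vector(Z %*% rnorm(k, 0, sqrt(lambda)) + rnorm(k * m, 0, sqrt(phi)))

# h = log f(y | u) + log f(v), with identity link v(u) = u
h_lik <- function(beta, v) {
  eta <- as.vector(X %*% beta + Z %*% v)
  sum(dnorm(y, eta, sqrt(phi), log = TRUE)) +
    sum(dnorm(v, 0, sqrt(lambda), log = TRUE))
}

# Score functions dh/dbeta and dh/dv for this model
score <- function(beta, v) {
  r <- y - as.vector(X %*% beta + Z %*% v)
  c(crossprod(X, r) / phi, crossprod(Z, r) / phi - v / lambda)
}

# Solve the score equations jointly (they are linear in beta and v here)
C <- rbind(cbind(crossprod(X) / phi, crossprod(X, Z) / phi),
           cbind(crossprod(Z, X) / phi, crossprod(Z) / phi + diag(k) / lambda))
sol <- solve(C, c(crossprod(X, y), crossprod(Z, y)) / phi)
beta_hat <- sol[1]; v_hat <- sol[-1]

max(abs(score(beta_hat, v_hat)))     # numerically zero at the joint maximum
```

For non-Gaussian responses the score equations are no longer linear, which is why the package solves them iteratively through an augmented GLM.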

Detailed description of the hglm fitting algorithm for a linear mixed model with heteroscedastic residual variance

In this section we describe the fitting algorithm in detail for a linear mixed model where fixed effects are included in the model for the residual variance. The extension to distributions other than Gaussian is described at the end of the section.

Lee and Nelder (1996) showed that linear mixed models can be fitted using a hierarchy of GLM by using an augmented linear model. Consider the linear mixed model

$$
y = Xb + Zu + e
$$

for which the variance-covariance matrix of $y$ is

$$
V = ZZ^T\sigma_u^2 + R\sigma_e^2
$$

where $R$ is a diagonal matrix with elements given by the estimated dispersion model (i.e. $\phi$ defined below). In the first iteration of the HGLM algorithm, $R$ is an identity matrix. The model may be written as an augmented weighted linear model:

$$
y_a = T_a\delta + e_a \tag{4}
$$

where

$$
y_a = \begin{pmatrix} y \\ 0_q \end{pmatrix}, \quad
T_a = \begin{pmatrix} X & Z \\ 0 & I_q \end{pmatrix}, \quad
\delta = \begin{pmatrix} b \\ u \end{pmatrix}, \quad
e_a = \begin{pmatrix} e \\ -u \end{pmatrix}
$$

Here, $q$ is the number of columns in $Z$, $0_q$ is a vector of zeros of length $q$, and $I_q$ is the identity matrix of size $q \times q$. The variance-covariance matrix of the augmented residual vector is given by

$$
V(e_a) = \begin{pmatrix} R\sigma_e^2 & 0 \\ 0 & I_q\sigma_u^2 \end{pmatrix}
$$

Given $\sigma_e^2$ and $\sigma_u^2$, this weighted linear model gives the same estimates of the fixed and random effects ($b$ and $u$ respectively) as Henderson's mixed model equations (Henderson, 1976).
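The stated equivalence with Henderson's mixed model equations can be verified numerically. The following base-R sketch assumes R = I and treats the two variance components as known; the simulated data and the variance values are illustrative assumptions.

```r
# Sketch: the augmented weighted linear model reproduces Henderson's
# mixed model equations. Simulated data and variance values are assumptions.
set.seed(2)
q <- 4; m <- 6; n <- q * m
X <- cbind(1, rnorm(n))                  # intercept plus one covariate
Z <- diag(q) %x% rep(1, m)
sigma2_e <- 1.2; sigma2_u <- 0.4         # treated as known here
y <- as.vector(X %*% c(1, 0.5) + Z %*% rnorm(q, 0, sqrt(sigma2_u)) +
               rnorm(n, 0, sqrt(sigma2_e)))

# Augmented response, design matrix and inverse-variance weights (R = I)
ya <- c(y, rep(0, q))
Ta <- rbind(cbind(X, Z), cbind(matrix(0, q, ncol(X)), diag(q)))
w  <- c(rep(1 / sigma2_e, n), rep(1 / sigma2_u, q))
delta_aug <- lm.wfit(Ta, ya, w)$coefficients

# Henderson's mixed model equations, multiplied through by sigma2_e
lhs <- rbind(cbind(crossprod(X), crossprod(X, Z)),
             cbind(crossprod(Z, X), crossprod(Z) + diag(q) * sigma2_e / sigma2_u))
rhs <- c(crossprod(X, y), crossprod(Z, y))
delta_mme <- solve(lhs, rhs)

all.equal(unname(delta_aug), delta_mme)
```

The first ncol(X) elements of either solution are the fixed effects b and the remaining q elements are the predicted random effects u.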

The estimates from weighted least squares are given by:

$$
T_a^T W^{-1} T_a \hat{\delta} = T_a^T W^{-1} y_a
$$

where $W \equiv V(e_a)$.

The two variance components are estimated iteratively by applying a gamma GLM to the residuals $e_i^2$ and $u_i^2$ with intercept terms included in the linear predictors. The leverages $h_i$ for these models are calculated from the diagonal elements of the hat matrix:

$$
H_a = T_a \left(T_a^T W^{-1} T_a\right)^{-1} T_a^T W^{-1} \tag{5}
$$

A gamma GLM is used to fit the dispersion part of the model with response

$$
y_{d,i} = e_i^2 / (1 - h_i) \tag{6}
$$

where $E(y_d) = \mu_d$ and $\mu_d \equiv \phi$ (i.e. $\sigma_e^2$ for a Gaussian response). The GLM model for the dispersion parameter is then specified by the link function $g_d(\cdot)$ and the linear predictor $X_d\beta_d$, with prior weights $(1 - h_i)/2$, for

$$
g_d(\mu_d) = X_d\beta_d \tag{7}
$$

Similarly, a gamma GLM is fitted to the dispersion term $\alpha$ (i.e. $\sigma_u^2$ for a GLMM) for the random effect $v$, with

$$
y_{\alpha,j} = u_j^2 / (1 - h_{n+j}), \quad j = 1, 2, \ldots, q \tag{8}
$$

and

$$
g_\alpha(\mu_\alpha) = \lambda \tag{9}
$$

where the prior weights are $(1 - h_{n+j})/2$ and the estimated dispersion term for the random effect is given by $\hat{\alpha} = g_\alpha^{-1}(\hat{\lambda})$.

The algorithm iterates by updating both $R = \mathrm{diag}(\hat{\phi})$ and $\sigma_u^2 = \hat{\alpha}$, and subsequently going back to Eq. (4).

For a non-Gaussian response variable $y$, the estimates are obtained simply by fitting a GLM instead of Eq. (4) and by replacing $e_i^2$ and $u_j^2$ with the deviance components from the augmented model (see Lee et al., 2006).
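The iteration described in Eqs. (4) to (9) can be sketched in base R for the simplest case of a Gaussian response with intercept-only dispersion models and log links. This is a minimal illustration under stated assumptions, not the code used inside hglm(): the simulated data, the starting values, the convergence tolerance and the iteration limit are all choices made for the example.

```r
# Sketch of the iterative fit for a Gaussian linear mixed model with
# intercept-only dispersion models. Starting values, tolerances and the
# simulated data are illustrative assumptions, not the package internals.
set.seed(3)
q <- 20; m <- 10; n <- q * m
X <- matrix(1, n, 1)
Z <- diag(q) %x% rep(1, m)
y <- as.vector(Z %*% rnorm(q) + rnorm(n))    # true variances: 1 and 1

ya <- c(y, rep(0, q))
Ta <- rbind(cbind(X, Z), cbind(matrix(0, q, ncol(X)), diag(q)))
phi <- 1; alpha <- 1                          # starting values
i_e <- 1:n; i_u <- n + 1:q                    # rows for observations / random effects

for (iter in 1:100) {
  w   <- c(rep(1 / phi, n), rep(1 / alpha, q))  # diagonal of W^{-1}
  TtW <- t(Ta * w)                              # Ta' W^{-1}
  delta <- solve(TtW %*% Ta, TtW %*% ya)        # weighted least squares, Eq. (4)
  h <- rowSums((Ta %*% solve(TtW %*% Ta)) * t(TtW))  # diagonal of hat matrix, Eq. (5)
  d <- as.vector(ya - Ta %*% delta)^2           # squared residuals e_i^2 and u_j^2

  # Gamma GLMs with log link for the two dispersion terms, Eqs. (6)-(9)
  fit_e <- glm(d[i_e] / (1 - h[i_e]) ~ 1, family = Gamma(link = "log"),
               weights = (1 - h[i_e]) / 2)
  fit_u <- glm(d[i_u] / (1 - h[i_u]) ~ 1, family = Gamma(link = "log"),
               weights = (1 - h[i_u]) / 2)
  phi_new   <- exp(coef(fit_e)[[1]])
  alpha_new <- exp(coef(fit_u)[[1]])

  converged <- max(abs(c(phi_new - phi, alpha_new - alpha))) < 1e-8
  phi <- phi_new; alpha <- alpha_new
  if (converged) break
}
c(phi = phi, alpha = alpha)
```

With covariates in the dispersion model, the right-hand side `~ 1` of the first gamma GLM would be replaced by the columns of X_d, and phi would become a vector of fitted values rather than a scalar.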

Implementation details

Distributions and link functions

There are two important classes of models that can be fitted in hglm: GLMM and conjugate HGLM. GLMMs have Gaussian random effects. Conjugate HGLMs have been commonly used partly due to the fact that explicit formulas for the marginal likelihood exist. HGLMs may be used to fit models in survival analysis (frailty models), where for instance the complementary-log-log link function can be used on binary responses (see e.g., Carling et al., 2004). The gamma distribution plays an important role in modeling responses with a constant coefficient of variation (see Chapter 8 in McCullagh and Nelder, 1989). For such responses with a gamma distributed random effect we have a gamma-gamma model. A summary of the most important models is given in Tables 1 and 2. Note that the random-effect distribution can be an arbitrary conjugate exponential-family distribution. For the specific case where the random-effect distribution is a conjugate to the distribution of y, this is called a conjugate HGLM. Further implementation details can be found in the hglm vignette.

Possible future developments

In the current version of hglm() it is possible to include a single random effect in the mean part of the model. An important development would be to include several random effects in the mean part of the model and also to include random effects in the dispersion parts of the model. The latter class of models is called Double HGLM and has been shown to be a useful tool for modeling heavy tailed distributions (Lee and Nelder, 2006).

The algorithm of hglm() gives true marginal likelihood estimates for the fixed effects in conjugate HGLM (Lee and Nelder, 1996, p. 629), whereas for other models the estimates are approximated. Lee and co-workers (see Lee et al., 2006, and references therein) have developed higher-order approximations, which are not implemented in the current version of the hglm package. For such extensions, we refer to the commercially available GenStat software (Payne et al., 2007), the recently available R package HGLMMM (Molas, 2010) and also to coming updates of hglm.

Examples

Example 1: A linear mixed model

Data description The output from the hglm() function for a linear mixed model is compared to the results from the lme() function in the nlme (Pinheiro et al., 2009) package using simulated data. In the simulated data there are five clusters with 20 observations in each cluster. For the mean part of the model, the simulated intercept value is $\mu = 0$, the variance for the random effect is $\sigma_u^2 = 0.2$, and the residual variance is $\sigma_e^2 = 1.0$.

Both functions produce the same estimate of the fixed intercept effect of 0.1473 (s.e. 0.16) and also the same variance component estimates. The summary.hglm() function gives the estimate of the variance component for the random intercept (0.082) as well as the residual variance (0.84). It also gives the logarithm of the variance component estimates together with standard errors below the lines "Model estimates for the dispersion term" and "Dispersion model for the random effects". The lme() function gives the square root of the variance component estimates.

The model diagnostics produced by the plot.hglm() function are shown in Figures 1 and 2. The data are completely balanced and therefore produce equal leverages (hatvalues) for all observations and also for all random effects (Figure 1). Moreover, the assumption of the deviance components being gamma distributed is acceptable (Figure 2).

Figure 1: Hatvalues (i.e. diagonal elements of the augmented hat-matrix) for each observation 1 to 100, and for each level in the random effect (index 101-105). [Plot omitted; x-axis: Index, y-axis: hatvalues.]

Table 1: Commonly used distributions and link functions possible to fit with hglm()

Model name          y|u distribution  Link g(µ)     u distribution  Link v(u)
Linear mixed model  Gaussian          identity      Gaussian        identity
Binomial conjugate  Binomial          logit         Beta            logit
Binomial GLMM       Binomial          logit         Gaussian        identity
Binomial frailty    Binomial          comp-log-log  Gamma           log
Poisson GLMM        Poisson           log           Gaussian        identity
Poisson conjugate   Poisson           log           Gamma           log
Gamma GLMM          Gamma             log           Gaussian        identity
Gamma conjugate     Gamma             inverse       Inverse-Gamma   inverse
Gamma-Gamma         Gamma             log           Gamma           log

Table 2: hglm code for commonly used models

Model name              Setting for family argument  Setting for rand.family argument
Linear mixed model (a)  gaussian(link = identity)    gaussian(link = identity)
Beta-Binomial           binomial(link = logit)       Beta(link = logit)
Binomial GLMM           binomial(link = logit)       gaussian(link = identity)
Binomial frailty        binomial(link = cloglog)     Gamma(link = log)
Poisson GLMM            poisson(link = log)          gaussian(link = identity)
Poisson frailty         poisson(link = log)          Gamma(link = log)
Gamma GLMM              Gamma(link = log)            gaussian(link = identity)
Gamma conjugate         Gamma(link = inverse)        inverse.gamma(link = inverse)
Gamma-Gamma             Gamma(link = log)            Gamma(link = log)

(a) For example, the hglm() code for a linear mixed model is
hglm(family = gaussian(link = identity), rand.family = gaussian(link = identity), ...)

Figure 2: Deviance diagnostics for each observation and each level in the random effect. [Plots omitted; left panel: Deviances against Index, right panel: Deviance Quantiles against Gamma Quantiles.]

The R code and output for this example are as follows:

R> library(hglm)
R> set.seed(123)
R> n.clus <- 5       #No. of clusters
R> n.per.clus <- 20  #No. of obs. per cluster
R> sigma2_u <- 0.2   #Variance of random effect
R> sigma2_e <- 1     #Residual variance
R> n <- n.clus*n.per.clus
R> X <- matrix(1, n, 1)
R> Z <- diag(n.clus)%x%rep(1, n.per.clus)
R> a <- rnorm(n.clus, 0, sqrt(sigma2_u))
R> e <- rnorm(n, 0, sqrt(sigma2_e))
R> mu <- 0
R> y <- mu + Z%*%a + e
R> lmm <- hglm(y = y, X = X, Z = Z)
R> summary(lmm)
R> plot(lmm)

Call:
hglm.default(X = X, y = y, Z = Z)

DISPERSION MODEL
WARNING: h-likelihood estimates through EQL can be biased.
Model estimates for the dispersion term:[1] 0.8400608
Model estimates for the dispersion term:
Link = log
Effects:
  Estimate Std. Error
   -0.1743     0.1441
Dispersion = 1 is used in Gamma model on deviances
to calculate the standard error(s).
Dispersion parameter for the random effects
[1] 0.08211
Dispersion model for the random effects:
Link = log
Effects:
  Estimate Std. Error
   -2.4997     0.8682
Dispersion = 1 is used in Gamma model on deviances
to calculate the standard error(s).

MEAN MODEL
Summary of the fixed effects estimates
    Estimate Std. Error t value Pr(>|t|)
X.1   0.1473     0.1580   0.933    0.353
Note: P-values are based on 96 degrees of freedom
Summary of the random effects estimate
     Estimate Std. Error
[1,]  -0.3237     0.1971
[2,]  -0.0383     0.1971
[3,]   0.3108     0.1971
[4,]  -0.0572     0.1971
[5,]   0.1084     0.1971
EQL estimation converged in 5 iterations.

R> #Same analysis with the lme function
R> library(nlme)
R> clus <- rep(1:n.clus,
+              rep(n.per.clus, n.clus))
R> summary(lme(y ~ 0 + X,
+              random = ~ 1 | clus))

Linear mixed-effects model fit by REML
 Data: NULL
      AIC      BIC    logLik
  278.635 286.4203 -136.3175

Random effects:
 Formula: ~1 | clus
        (Intercept) Residual
StdDev:   0.2859608   0.9166

Fixed effects: y ~ 0 + X
      Value Std.Error DF   t-value p-value
X 0.1473009 0.1573412 95 0.9361873  0.3516

Standardized Within-Group Residuals:
       Min         Q1       Med        Q3       Max
-2.5834807 -0.6570612 0.0270673 0.6677986 2.1724148

Number of Observations: 100
Number of Groups: 5
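Because hglm() reports the dispersion estimates on the log scale and lme() reports standard deviations, the two outputs are easiest to compare after converting both to variances. The short check below uses only numbers copied from the two outputs above; the variable names are introduced here for illustration.

```r
# Worked check using only numbers printed in the two outputs above
hglm_log_disp <- c(residual = -0.1743, random = -2.4997)  # hglm: log-scale estimates
hglm_var <- exp(hglm_log_disp)                            # back-transform to variances

lme_sd  <- c(residual = 0.9166, random = 0.2859608)       # lme: standard deviations
lme_var <- lme_sd^2                                       # square to get variances

round(hglm_var, 3)
round(lme_var, 3)
```

Both sets of variances agree to within about 0.001, which is the sense in which the two functions give the same variance component estimates.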

Example 2: Analysis of simulated data for a linear mixed model with heteroscedastic residual variance

Data description Here, a heteroscedastic residual variance is added to the simulated data from the previous example. Given the explanatory variable $x_d$, the simulated residual variance is 1.0 for $x_d = 0$ and 2.72 for $x_d = 1$. The output shows that the variance of the random effect is 0.109, and that $\hat{\beta}_d = (-0.32, 1.47)$, i.e. the two residual variances are estimated as 0.72 and 3.16. (Code continued from Example 1)

R> beta.disp <- 1
R> X_d <- matrix(1, n, 2)
R> X_d[,2] <- rbinom(n, 1, .5)
R> colnames(X_d) <- c("Intercept", "x_d")
R> e <- rnorm(n, 0,
+            sqrt(sigma2_e*exp(beta.disp*X_d[,2])))
R> y <- mu + Z%*%a + e
R> summary(hglm(y = y, X = X, Z = Z,
+               X.disp = X_d))

Call:
hglm.default(X = X, y = y, Z = Z, X.disp = X_d)

DISPERSION MODEL
WARNING: h-likelihood estimates through EQL can be biased.
Model estimates for the dispersion term:
Link = log
Effects:
          Estimate Std. Error
Intercept  -0.3225     0.2040
x_d         1.4744     0.2881
Dispersion = 1 is used in Gamma model on deviances
to calculate the standard error(s).
Dispersion parameter for the random effects
[1] 0.1093
Dispersion model for the random effects:
Link = log
Effects:
  Estimate Std. Error
   -2.2135     0.8747
Dispersion = 1 is used in Gamma model on deviances
to calculate the standard error(s).

MEAN MODEL
Summary of the fixed effects estimates
    Estimate Std. Error t value Pr(>|t|)
X.1  -0.0535     0.1836  -0.291    0.771
Note: P-values are based on 96 degrees of freedom
Summary of the random effects estimate
     Estimate Std. Error
[1,]   0.0498     0.2341
[2,]  -0.2223     0.2276
[3,]   0.4404     0.2276
[4,]  -0.1786     0.2276
[5,]  -0.0893     0.2296
EQL estimation converged in 5 iterations.
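The two estimated residual variances quoted in the data description follow from applying the inverse log link to the linear predictor of the dispersion model. The sketch below reproduces that arithmetic from the coefficients printed above; the object names are introduced here for illustration only.

```r
# Back-transform the dispersion-model coefficients reported in the output above
beta_d <- c(Intercept = -0.3225, x_d = 1.4744)
X_d_levels <- rbind("x_d = 0" = c(1, 0),      # design rows for the two groups
                    "x_d = 1" = c(1, 1))
resid_var <- exp(as.vector(X_d_levels %*% beta_d))   # phi = exp(X_d beta_d)
names(resid_var) <- rownames(X_d_levels)
round(resid_var, 2)

# Simulated (true) values: sigma2_e * exp(beta.disp * x_d) with sigma2_e = 1, beta.disp = 1
true_var <- exp(c(0, 1))

# Variance of the random effect from its log-scale estimate
rand_var <- exp(-2.2135)
```

The same back-transformation applies to any row of X.disp, so group-specific residual variances can be read off for more than two groups.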

Example 3: Fitting a Poisson model with gamma random effects, and fixed effects in the dispersion term

Data description We simulate a Poisson model with random effects and estimate the parameter in the dispersion term for an explanatory variable $x_d$. The estimated dispersion parameter for the random effects is 1.926 (0.6556 on the log scale). (Code continued from Example 2)

R> u <- rgamma(n.clus,1)
R> eta <- exp(mu + Z%*%u)
R> y <- rpois(length(eta), eta)
R> gamma.pois <- hglm(y = y, X = X, Z = Z,
+                    X.disp = X_d,
+                    family = poisson(link = log),
+                    rand.family = Gamma(link = log))
R> summary(gamma.pois)

Call:
hglm.default(X = X, y = y, Z = Z,
    family = poisson(link = log),
    rand.family = Gamma(link = log), X.disp = X_d)

DISPERSION MODEL
WARNING: h-likelihood estimates through EQL can be biased.
Model estimates for the dispersion term:
Link = log
Effects:
          Estimate Std. Error
Intercept  -0.0186     0.2042
x_d         0.4087     0.2902
Dispersion = 1 is used in Gamma model on deviances
to calculate the standard error(s).
Dispersion parameter for the random effects
[1] 1.926
Dispersion model for the random effects:
Link = log
Effects:
  Estimate Std. Error
    0.6556     0.7081
Dispersion = 1 is used in Gamma model on deviances
to calculate the standard error(s).

MEAN MODEL
Summary of the fixed effects estimates
    Estimate Std. Error t value Pr(>|t|)
X.1   2.3363     0.6213    3.76 0.000293
---
Note: P-values are based on 95 degrees of freedom
Summary of the random effects estimate
     Estimate Std. Error
[1,]   1.1443     0.6209
[2,]  -1.6482     0.6425
[3,]  -2.5183     0.6713
[4,]  -1.0243     0.6319
[5,]   0.2052     0.6232
EQL estimation converged in 3 iterations.

Example 4: Incorporating correlated random effects in a linear mixed model - a genetics example

Data description The data consists of 2025 individuals from two generations where 1000 individuals have observed trait values y that are approximately normal (Figure 3). The data we analyze was simulated for the QTLMAS 2009 Workshop (Coster et al., 2010)¹. A longitudinal growth trait was simulated. For simplicity we analyze only the values given on the third occasion at age 265 days.

Figure 3: Histogram and qqplot for the analyzed trait. [Plots omitted; histogram y-axis: Frequency, qqplot axes: Theoretical Quantiles and Sample Quantiles.]

We fitted a model with a fixed intercept and a random animal effect, $a$, where the correlation structure of $a$ is given by the additive relationship matrix $A$ (which is obtained from the available pedigree information). An incidence matrix $Z_0$ was constructed and relates observation number with id-number in the pedigree. For observation $y_i$ coming from individual $j$ in the ordered pedigree file $Z_0[i, j] = 1$, and all other elements are 0. Let $L$ be the Cholesky factorization of $A$, and $Z = Z_0 L$. The design matrix for the fixed effects, $X$, is a column of ones. The estimated variance components are $\hat{\sigma}_e^2 = 2.21$ and $\hat{\sigma}_u^2 = 1.50$.

The R code for this example is given below.

R> data(QTLMAS)
R> y <- QTLMAS[,1]
R> Z <- QTLMAS[,2:2026]
R> X <- matrix(1, 1000, 1)
R> animal.model <- hglm(y = y, X = X, Z = Z)
R> print(animal.model)

Call:
hglm.default(X = X, y = y, Z = Z)

Fixed effects:
     X.1
7.279766
Random effects:
[1] -1.191733707  1.648604776  1.319427376 -0.928258503
[5] -0.471083317 -1.058333534  1.011451565  1.879641994
[9]  0.611705900 -0.259125073 -1.426788944 -0.005165978
...
Dispersion parameter for the mean model:[1] 2.211169
Dispersion parameter for the random effects:[1] 1.502516
EQL estimation converged in 2 iterations
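The QTLMAS object used above already contains the transformed design matrix. For readers who need to build such a matrix themselves, the following base-R sketch shows one way to construct an incidence matrix and combine it with the Cholesky factor of a relationship matrix. The three-individual pedigree (two unrelated parents and one offspring) and the id vector are made-up assumptions.

```r
# Sketch: build an incidence matrix Z0 and the transformed design matrix Z = Z0 L.
# The relationship matrix A and the id vector are small made-up examples.
A <- matrix(c(1.0, 0.0, 0.5,
              0.0, 1.0, 0.5,
              0.5, 0.5, 1.0), nrow = 3, byrow = TRUE)  # two parents and one offspring
id <- c(3, 1, 3, 2)            # individual (row of A) behind each observation

Z0 <- matrix(0, nrow = length(id), ncol = nrow(A))
Z0[cbind(seq_along(id), id)] <- 1   # Z0[i, j] = 1 if observation i comes from individual j

L <- t(chol(A))                # lower-triangular factor with L %*% t(L) == A
Z <- Z0 %*% L                  # design matrix that could be passed as the Z argument
```

Individuals without observations simply correspond to all-zero columns of Z0, which is how 2025 pedigree members can enter a model with only 1000 observed trait values.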

Example 5: Binomial-beta model applied to seed germination data

Data description The seed germination data presented by Crowder (1978) has previously been analyzed using a binomial GLMM (Breslow and Clayton, 1993) and a binomial-beta HGLM (Lee and Nelder, 1996). The data consists of 831 observations from 21 germination plates. The effect of seed variety and type of root extract was studied in a 2 × 2 factorial lay-out. We fit the binomial-beta HGLM used by Lee and Nelder (1996); setting fix.disp = 1 in hglm() produces estimates comparable to the ones obtained by Lee and Nelder (with differences $< 2 \times 10^{-3}$). The beta distribution parameter $\alpha$ in Lee and Nelder (1996) was defined as $1/(2a)$ where $a$ is the dispersion term obtained from hglm(). The output from the R code given below gives $\hat{a} = 0.0248$ and the corresponding estimate given in Lee and Nelder (1996) is $\hat{a} = 1/(2\hat{\alpha}) = 0.023$. We conclude that the hglm package produces results similar to those presented in Lee and Nelder (1996), and that its dispersion parameter estimates differ by less than 1% from those obtained using the EQL method in GenStat. Additional examples, together with comparisons to estimates produced by GenStat, are given in the hglm vignette included in the package on CRAN.

R> data(seeds)
R> germ <- hglm(
+    fixed = r/n ~ extract*I(seed=="O73"),
+    weights = n, data = seeds,
+    random = ~1|plate, family = binomial(),
+    rand.family = Beta(), fix.disp = 1)
R> summary(germ)

Call:
hglm.formula(family = binomial(), rand.family = Beta(),
    fixed = r/n ~ extract * I(seed == "O73"),
    random = ~1 | plate, data = seeds,
    weights = n, fix.disp = 1)

DISPERSION MODEL
WARNING: h-likelihood estimates through EQL can be biased.
Model estimates for the dispersion term:[1] 1
Model estimates for the dispersion term:
Link = log
Effects:
[1] 1
Dispersion = 1 is used in Gamma model on deviances to
calculate the standard error(s).
Dispersion parameter for the random effects
[1] 0.02483
Dispersion model for the random effects:
Link = log
Effects:
  Estimate Std. Error
   -3.6956     0.5304
Dispersion = 1 is used in Gamma model on deviances to
calculate the standard error(s).

MEAN MODEL
Summary of the fixed effects estimates
                               Estimate Std. Error t value
(Intercept)                     -0.5421     0.1928  -2.811
extractCucumber                  1.3386     0.2733   4.898
I(seed == "O73")TRUE             0.0751     0.3114   0.241
extractCucumber:I(seed=="O73")  -0.8257     0.4341  -1.902
                               Pr(>|t|)
(Intercept)                    0.018429
extractCucumber                0.000625
I(seed == "O73")TRUE           0.814264
extractCucumber:I(seed=="O73") 0.086343
---
Note: P-values are based on 10 degrees of freedom
Summary of the random effects estimate
      Estimate Std. Error
[1,]   -0.2333     0.2510
[2,]    0.0085     0.2328
...
[21,]  -0.0499     0.2953
EQL estimation converged in 7 iterations.

¹ http://www.qtlmas2009.wur.nl/UK/Dataset

Summary

The hierarchical generalized linear model approach offers new possibilities to fit generalized linear models with random effects. The hglm package extends existing GLMM fitting algorithms to include fixed effects in a model for the residual variance, fits models where the random effect distribution is not necessarily Gaussian and estimates variance components for correlated random effects. For such models there are important applications in, for instance: genetics (Noh et al., 2006), survival analysis (Ha and Lee, 2005), credit risk modeling (Alam and Carling, 2008), count data (Lee et al., 2006) and dichotomous responses (Noh and Lee, 2007). We therefore expect that this new package will be of use for applied statisticians in several different fields.

Bibliography

M. Alam and K. Carling. Computationally feasible estimation of the covariance structure in generalized linear mixed models GLMM. Journal of Statistical Computation and Simulation, 78:1227–1237, 2008.

M. Alam, L. Rönnegård, and X. Shen. hglm: Hierarchical Generalized Linear Models, 2010. URL http://CRAN.R-project.org/package=hglm. R package version 1.1.1.

D. Bates and M. Maechler. lme4: Linear mixed-effects models using S4 classes, 2010. URL http://CRAN.R-project.org/package=lme4. R package version 0.999375-37.

N. E. Breslow and D. G. Clayton. Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88:9–25, 1993.

K. Carling, L. Rönnegård, and K. Roszbach. An analysis of portfolio credit risk when counterparties are interdependent within industries. Sveriges Riksbank Working Paper, 168, 2004.

A. Coster, J. Bastiaansen, M. Calus, C. Maliepaard, and M. Bink. QTLMAS 2010: Simulated dataset. BMC Proceedings, 4(Suppl 1):S3, 2010.

M. J. Crowder. Beta-binomial ANOVA for proportions. Applied Statistics, 27:34–37, 1978.

P. K. Dunn and G. K. Smyth. dglm: Double generalized linear models, 2009. URL http://CRAN.R-project.org/package=dglm. R package version 1.6.1.

I. D. Ha and Y. Lee. Comparison of hierarchical likelihood versus orthodox best linear unbiased predictor approaches for frailty models. Biometrika, 92:717–723, 2005.

C. R. Henderson. A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics, 32(1):69–83, 1976.

Y. Lee and J. A. Nelder. Double hierarchical generalized linear models with discussion. Applied Statistics, 55:139–185, 2006.

Y. Lee and J. A. Nelder. Hierarchical generalized linear models with discussion. J. R. Statist. Soc. B, 58:619–678, 1996.

Y. Lee, J. A. Nelder, and Y. Pawitan. Generalized linear models with random effects. Chapman & Hall/CRC, 2006.

Y. Lee, J. A. Nelder, and M. Noh. H-likelihood: problems and solutions. Statistics and Computing, 17:49–55, 2007.

M. Lynch and B. Walsh. Genetics and Analysis of Quantitative Traits. Sinauer Associates, Inc., 1998. ISBN 087893481.

P. McCullagh and J. A. Nelder. Generalized linear models. Chapman & Hall/CRC, 1989.

M. Molas. HGLMMM: Hierarchical Generalized Linear Models, 2010. URL http://CRAN.R-project.org/package=HGLMMM. R package version 0.1.1.

M. Noh and Y. Lee. REML estimation for binary data in GLMMs. Journal of Multivariate Analysis, 98:896–915, 2007.

M. Noh, B. Yip, Y. Lee, and Y. Pawitan. Multicomponent variance estimation for binary traits in family-based studies. Genetic Epidemiology, 30:37–47, 2006.

R. W. Payne, D. A. Murray, S. A. Harding, D. B. Baird, and D. M. Soutar. GenStat for Windows (10th edition) introduction, 2007. URL http://www.vsni.co.uk/software/genstat.

J. Pinheiro, D. Bates, S. DebRoy, D. Sarkar, and the R Core team. nlme: Linear and Nonlinear Mixed Effects Models, 2009. URL http://CRAN.R-project.org/package=nlme. R package version 3.1-96.

L. Rönnegård and Ö. Carlborg. Separation of base allele and sampling term effects gives new insights in variance component QTL analysis. BMC Genetics, 8(1), 2007.

W. N. Venables and B. D. Ripley. Modern Applied Statistics with S. Springer, New York, fourth edition, 2002. URL http://www.stats.ox.ac.uk/pub/MASS4. ISBN 0-387-95457-0.

Lars Rönnegård

Statistics Unit

Dalarna University, Sweden

and

Department of Animal Breeding and Genetics

Swedish University of Agricultural Sciences, Sweden

lrn@du.se

Xia Shen

Department of Cell and Molecular Biology

Uppsala University, Sweden

and

Statistics Unit

Dalarna University, Sweden

xia.shen@lcb.uu.se

Moudud Alam

Statistics Unit

Dalarna University, Sweden

maa@du.se

