
Checking the Short-Term and Long-Term Hazard Ratio Model for Survival Data

September 2012
Scandinavian Journal of Statistics 39(3)




Scandinavian Journal of Statistics, Vol. 39: 554–567, 2012

doi: 10.1111/j.1467-9469.2012.00804.x

Checking the Short-Term and Long-Term

Hazard Ratio Model for Survival Data

SONG YANG

Office of Biostatistics Research, National Heart, Lung and Blood Institute

YICHUAN ZHAO

Department of Mathematics and Statistics, Georgia State University

ABSTRACT. The short-term and long-term hazard ratio model includes the proportional hazards

model and the proportional odds model as submodels, and allows a wider range of hazard ratio

patterns compared with some of the more traditional models. We propose two omnibus tests for

checking this model, based, respectively, on the martingale residuals and the contrast between the

non-parametric and model-based estimators of the survival function. These tests are shown to be

consistent against any departure from the model. The empirical behaviours of the tests are studied

in simulations, and the tests are illustrated with some real data examples.

Key words: censoring, goodness of fit, martingale residuals, model checking, omnibus test, survival data

1. Introduction

In clinical trials, life testing and reliability studies, the problem of comparing two groups

of data is often encountered. Usually a summary measure is used to capture the difference

between the two groups, and the baseline distribution is left unspeciﬁed. For survival data,

the Cox proportional hazards model (Cox, 1972) is the most widely used and its parameter

has an appealing interpretation as the hazard ratio, or the relative risk, for the two groups.

The proportional hazards model and the derived hazard ratio estimate provide good approxi-

mations in many situations when the hazards of the two groups are nearly proportional.

When the assumption of a constant hazard ratio is in doubt, common alternatives include

the accelerated failure time model (Kalbﬂeisch & Prentice, 2002) and the proportional odds

model (Bennett, 1983). The proportional hazards model and the proportional odds model

also belong to the class of transformation models (Bickel et al., 1993). These alternative

models have been studied extensively in the literature, in works such as Ying (1993), Cheng

et al. (1995), Murphy et al. (1997), Yang & Prentice (1999) and Chen et al. (2002), among

others. In addition to these more established models, other semiparametric models have

also been proposed (e.g. Tsodikov, 2002; Chen & Cheng, 2006; Zeng & Lin, 2007).

Yang & Prentice (2005) proposed a model which includes the proportional hazards model and the proportional odds model as submodels. Assume absolutely continuous failure time distributions and label the two groups control and treatment, with cumulative hazard functions $\Lambda_C(t)$ and $\Lambda_T(t)$, and hazard rate functions $\lambda_C(t)$ and $\lambda_T(t)$, respectively. The model of Yang & Prentice (2005) postulates that

$$\lambda_T(t)=\frac{\theta_1\theta_2}{\theta_1+(\theta_2-\theta_1)S_C(t)}\,\lambda_C(t), \qquad (1)$$

almost everywhere for $t<\tau_0$, where $S_C$ is the survivor function of the control group, $\theta_1,\theta_2>0$ and $\tau_0$ is the upper boundary of the support of the control distribution: $\tau_0=\sup\{x:\int_0^x\lambda_C(t)\,dt<\infty\}$. Under this model, $\theta_1$ and $\theta_2$ can be interpreted as the short-term and long-term hazard ratio, respectively, and various patterns of the hazard ratio can be realized, such as proportional hazards, no initial effect, disappearing effect or crossing hazards. The survivor functions may also cross, a phenomenon not possible under the linear transformation models and the accelerated failure time model.
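As a small numerical illustration of model (1), the following Python sketch evaluates the treatment-to-control hazard ratio as a function of the control survival probability. The function name `hazard_ratio` and the example values 0.9 and 1.2 are chosen here for illustration only; they are assumptions of this sketch, not part of the original development.

```python
def hazard_ratio(s_c, theta1, theta2):
    """Treatment-to-control hazard ratio under model (1).

    s_c    : control-group survival probability S_C(t), in [0, 1]
    theta1 : short-term hazard ratio (value of the ratio when S_C = 1)
    theta2 : long-term hazard ratio (limit of the ratio when S_C -> 0)
    """
    if not 0.0 <= s_c <= 1.0:
        raise ValueError("s_c must be a survival probability in [0, 1]")
    if theta1 <= 0 or theta2 <= 0:
        raise ValueError("theta1 and theta2 must be positive")
    return theta1 * theta2 / (theta1 + (theta2 - theta1) * s_c)


# Illustrative values (hypothetical): ratio moves from 0.9 towards 1.2
# as the control survival probability decreases from 1 to 0.
for s in (1.0, 0.75, 0.5, 0.25, 0.0):
    print(s, hazard_ratio(s, 0.9, 1.2))
```

When the two parameters are equal the ratio is constant in time, which is the proportional hazards submodel.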

Note that this model has an asymmetric flavour in that interchange of the control and treatment groups will not result in a model of the same form. Also, in clinical applications, heavy censoring is often present and there may be no data available near $\tau_0$. Thus in applications one is interested in using model (1) for $t$ in $[0,\tau]$, where $\tau<\tau_0$, a range of interest with adequate data available.

In this article, we propose two omnibus tests for checking model (1), based, respectively,

on the martingale residuals and the contrast between the non-parametric and model-based

estimators of the survival function. In the literature, the martingale residuals have often been

used to detect departures from the assumed model (Therneau et al., 1990; Lin et al., 1993).

Lin et al. (1993) studied various partial-sum processes of the martingale residuals, and used

simulated Gaussian processes to approximate the distributions of those processes under the

proportional hazards regression model. The martingale residual-based test proposed here is

related to the test for the proportional hazards regression model in Lin et al. (1993), but

different in that we will consider integrals of the martingale residuals.

While the martingale residual-based test and related graphical methods have been shown in

the literature to be extremely useful for the Cox proportional hazards regression model, the

partial-sum processes of martingale residuals themselves do not have a simple interpre-

tation. The contrast-based test uses the non-parametric survival function estimate, and has a

very simple and direct interpretation as the difference between the estimated survival proba-

bilities. Under the null hypothesis that model (1) holds, the distribution of both the martin-

gale residual-based test and the contrast-based test can be approximated through simulations.

We will show that both tests are consistent against any departure from model (1). Various

numerical simulations show that the proposed tests have good empirical size and power. The

proposed procedures will be illustrated in real data examples.

We organize the article as follows: In section 2, distributional results for the stochastic pro-

cesses used for the tests are established. Then the p-values of the tests are studied and the

consistency of the tests against departure from the model is established. In section 3, simula-

tion results and data examples are presented. Some concluding remarks are given in section 4.

Proofs of the asymptotic results are placed in appendix S1.

2. Distributional results

Let $T_1,\ldots,T_n$ be the pooled lifetimes of the two groups, and $C_1,\ldots,C_n$ be the corresponding censoring variables. Let the sample sizes of the two groups be $n_1$ and $n_2$, respectively, and arrange the indices such that $T_1,\ldots,T_{n_1}$, $n_1<n$, constitute the control group. Let $Z_i=I(i>n_1)$, $i=1,\ldots,n$, where $I(\cdot)$ is the indicator function. We assume that $T_1,\ldots,T_n,C_1,\ldots,C_n$ are independent. The available data are the triplets $(X_i,\delta_i,Z_i)$, $i=1,\ldots,n$, where $X_i=T_i\wedge C_i$ and $\delta_i=I(T_i\le C_i)$, with $\wedge$ denoting the minimum.

To simplify the presentation, throughout this article, we will assume the following condi-

tions.

Condition 1. $\lim n_1/n=\rho\in(0,1)$.

Condition 2. The data range of interest is $[0,\tau]$, where $\tau<\tau_0$. The survivor function $G_i$ of the censoring variable $C_i$ given $Z_i$ is continuous and satisfies $\sum_{i\le n_1}G_i(t)/n_1\to\Gamma_C(t)$, $\sum_{i>n_1}G_i(t)/n_2\to\Gamma_T(t)$, uniformly in $t\le\tau$, for some functions $\Gamma_C(t)$, $\Gamma_T(t)$ satisfying $\Gamma_C(\tau)\Gamma_T(\tau)>0$.

Condition 3. The survivor functions $S_C$ and $S_T$ of the two comparison groups are absolutely continuous, and $S_C(\tau)S_T(\tau)>0$.

Define $R(t)=1/S_C(t)-1$, $t\le\tau$, the odds function of the control group. Let $\beta=(\beta_1,\beta_2)^T$, where 'T' denotes transpose and $\beta_1=\log\theta_1$, $\beta_2=\log\theta_2$. With this parameterization, we consider the model

$$\lambda_i(t)=\frac{1}{\gamma_{1i}(\beta)+\gamma_{2i}(\beta)R(t)}\,\frac{dR(t)}{dt},\quad i=1,\ldots,n, \qquad (2)$$

almost everywhere on $t\in[0,\tau]$, where $\lambda_i(t)$ is the hazard function for $T_i$ given $Z_i$, and $\gamma_{ji}(b)=\exp(-b_jZ_i)$, $j=1,2$, for $b=(b_1,b_2)^T$.
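The step from model (1) to the reparameterized model (2) can be checked directly. The short derivation below is a sketch added for the reader, written with $\theta_j=e^{\beta_j}$ and $R=1/S_C-1$; it uses only the definitions given in the text.

```latex
\begin{aligned}
\frac{dR(t)}{dt} &= \frac{d}{dt}\,\frac{1}{S_C(t)} = \frac{\lambda_C(t)}{S_C(t)},
\qquad 1+R(t)=\frac{1}{S_C(t)},\\[4pt]
Z_i=0:\quad \lambda_i(t) &= \frac{1}{1+R(t)}\,\frac{\lambda_C(t)}{S_C(t)} = \lambda_C(t),\\[4pt]
Z_i=1:\quad \lambda_i(t) &= \frac{1}{e^{-\beta_1}+e^{-\beta_2}R(t)}\,\frac{\lambda_C(t)}{S_C(t)}
 = \frac{\lambda_C(t)}{S_C(t)/\theta_1+\{1-S_C(t)\}/\theta_2}\\
 &= \frac{\theta_1\theta_2}{\theta_1+(\theta_2-\theta_1)S_C(t)}\,\lambda_C(t).
\end{aligned}
```

The last line is the hazard relation of model (1), so the two formulations agree on $[0,\tau]$.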

To develop the tests for model (2), we need an estimator for $\beta$. Although it is possible to adopt the pseudo-likelihood approach in Yang & Prentice (2005), here we will use a simpler estimator. If $R$ were known, the log-likelihood function of model (2) would be proportional to

$$-\sum_{i=1}^{n}\left[\delta_i\ln\{\gamma_{1i}(b)+\gamma_{2i}(b)R(X_i)\}+\ln\{1+\gamma_{2i}(b)R(X_i)/\gamma_{1i}(b)\}/\gamma_{2i}(b)\right].$$

This motivates the following estimator for $\beta$. Let $\hat S_C$ be the Kaplan–Meier estimator of $S_C$ and $\hat R=1/\hat S_C-1$ (Kaplan & Meier, 1958). Throughout this article, $c/d$ is defined to be zero when $d=0$. Let $\gamma_j(b)=\exp(-b_j)$, $j=1,2$. For $\tau$ satisfying conditions 2 and 3, define

$$p(b)=\sum_{i>n_1}\left[I(X_i\le\tau)\,\delta_i\ln\{\gamma_1(b)+\gamma_2(b)\hat R(X_i)\}+\ln\{1+\gamma_2(b)\hat R(X_i\wedge\tau)/\gamma_1(b)\}/\gamma_2(b)\right].$$

Then we will use the minimizer $\hat\beta$ of $p(b)$ to estimate $\beta$. Equivalently, $\hat\beta$ is the zero of the gradient $\nabla p(b)$ of $p(b)$. Define

$$A(b)=\int_0^\tau\ln\{\gamma_1(b)+\gamma_2(b)R(t)\}\,\Gamma_T(t)S_T(t)\,d\Lambda_T(t)+\int_0^\tau\frac{\{1+R(t)\}\Gamma_T(t)S_T(t)}{\gamma_1(b)+\gamma_2(b)R(t)}\,d\Lambda_C(t).$$

It can be checked that, under model (2), $\nabla A(b)$ is zero at $\beta$. We will also assume the following condition.

Condition 4. The function $A(b)$ has a unique minimum that occurs at the unique zero of $\nabla A$.
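The objective function minimized by the estimator described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the helper names `km_odds` and `objective` are hypothetical, the sketch raises an error where the Kaplan–Meier estimate reaches zero instead of applying the zero-division convention of the text, and the minimization itself is left to any general-purpose optimizer.

```python
import math


def km_odds(times, events):
    """Return the step function R_hat(t) = 1/S_hat(t) - 1 for the control
    group, where S_hat is the Kaplan-Meier estimator."""
    data = sorted(zip(times, events))
    n = len(data)
    steps = []  # (event time, Kaplan-Meier value from that time onwards)
    surv, at_risk, i = 1.0, n, 0
    while i < n:
        t = data[i][0]
        deaths = removed = 0
        while i < n and data[i][0] == t:  # handle ties at time t
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            steps.append((t, surv))
        at_risk -= removed

    def r_hat(t):
        s = 1.0
        for u, v in steps:
            if u > t:
                break
            s = v
        if s <= 0.0:
            raise ValueError("Kaplan-Meier estimate is zero; R_hat undefined")
        return 1.0 / s - 1.0

    return r_hat


def objective(b, r_hat, times_t, events_t, tau):
    """Objective p(b) summed over the treatment group (to be minimized)."""
    g1, g2 = math.exp(-b[0]), math.exp(-b[1])
    total = 0.0
    for x, d in zip(times_t, events_t):
        if d and x <= tau:
            total += math.log(g1 + g2 * r_hat(x))
        total += math.log(1.0 + g2 * r_hat(min(x, tau)) / g1) / g2
    return total
```

A usage example would build `r_hat` from the control arm, then pass `lambda b: objective(b, r_hat, x_treat, d_treat, tau)` to a two-dimensional minimizer.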

Let

$$M_i(t)=\delta_iI(X_i\le t)-\int_0^t\frac{I(X_i\ge s)}{\gamma_{1i}(\beta)+\gamma_{2i}(\beta)R(s)}\,dR(s)$$

be the martingale associated with the $i$th study subject, $1\le i\le n$, where throughout the article we use the notation $\int_l^u=\int_{(l,u]}$. Now define the martingale residuals

$$\hat M_i(t)=\delta_iI(X_i\le t)-\int_0^t\frac{I(X_i\ge s)\,d\hat R(s)}{\gamma_{1i}(\hat\beta)+\gamma_{2i}(\hat\beta)\hat R(s)},\quad 1\le i\le n.$$

Note that $\hat M_i(t)$ does not involve $\hat\beta$ for $i\le n_1$. It will be shown later that, under model (2) and conditions 1–4, $\hat\beta$ is strongly consistent for $\beta$. Thus, $\sum_{i>n_1}\hat M_i(t)$ will be close to $\sum_{i>n_1}M_i(t)$, and will fluctuate around zero under model (2). This leads us to define a martingale residual-based test that rejects model (2) if $\sup_{t\le\tau}|\sum_{i>n_1}\int_0^t\phi(s)\,d\hat M_i(s)|/\sqrt n$ is large, where $\phi$ is a data-dependent function. The choice of $\phi$ is discussed later in this section.

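The asymptotic distribution of the test involves that of $\hat\beta$. To describe these asymptotic distributions, let

$$\bar M_1(t)=\frac{1}{\sqrt n}\sum_{i\le n_1}M_i(t),\qquad \bar M_2(t)=\frac{1}{\sqrt n}\sum_{i>n_1}M_i(t),$$

$$K_1(t)=\sum_{i\le n_1}I(X_i\ge t),\qquad K_2(t)=\sum_{i>n_1}I(X_i\ge t),$$

$$N_1(t)=\sum_{i\le n_1}\delta_iI(X_i\le t),\qquad N_2(t)=\sum_{i>n_1}\delta_iI(X_i\le t).$$

Define $D(g,b)=\gamma_1(b)+\gamma_2(b)g$ for $g>0$ and let

$$W(g,b)=\left(\frac{\gamma_1(b)}{D(g,b)},\ \frac{\gamma_2(b)g}{D(g,b)}\right)^T,\qquad \Omega(b)=-\{H(\tilde p/n,b)\}^{-1},$$

where $H(f,b)$ is the Hessian matrix of the function $f$ at $b$ and

$$\tilde p(b)=\int_0^\tau\ln\{\gamma_1(b)+\gamma_2(b)\hat R(t)\}\,dN_2(t)+\int_0^\tau\frac{K_2(t)\,d\hat R(t)}{\gamma_1(b)+\gamma_2(b)\hat R(t)}.$$

Define

$$W_T(t)=-W(\hat R(t),\beta),$$

$$W_C(t)=-\frac{K_2(t)W_T(t)}{K_1(t)D(R(t),\beta)\hat S_C(t)}+\frac{1}{K_1(t)}\int_t^\tau\frac{K_2(s)W_T(s)}{D(R(s),\beta)}\left\{\frac{\gamma_2(\beta)}{D(\hat R(s),\beta)\hat S_C(s)}-1\right\}d\hat R(s). \qquad (3)$$

Let $\hat W_T$ be the estimator of $W_T$ defined by replacing $\beta$ and $R$ with $\hat\beta$ and $\hat R$, respectively. Similarly, define an estimator $\hat W_C$ of $W_C$. We first establish the following results.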

Theorem 1. Suppose that conditions 1–4 hold. Then, under model (2),

(i) $\hat\beta$ is strongly consistent for $\beta$;

(ii) $\sqrt n(\hat\beta-\beta)$ has a limiting zero mean normal distribution. A strongly consistent estimator of the limiting covariance matrix is given by

$$\Omega(\hat\beta)\left(\int_0^\tau\hat W_T\hat W_T^T\,dN_2/n+\int_0^\tau\hat W_C\hat W_C^T\,dN_1/n\right)\Omega(\hat\beta).$$

For the integrand in the martingale residual-based test, while many choices are possible,

based on a trial-and-error process of simulation studies and real data applications, we will

work with the choice (t)=1(t)2(t), where

1(t)=K1(t)

K1(t)+K2(t)and 2(t)=1+4{K1(t)+K2(t)}

n1−K1(t)+K2(t)

n.

The factor 1(t) is used to help stabilize the integral near the upper tail. The function 2(t)

assigns weights between 1 and 2 to all data points, with more weight in the central region

and less weight towards the boundaries of the data range.
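The two weight factors just described depend on the data only through the numbers still at risk in each group. The sketch below computes them for a given time point; the function name `weight_factors` and the toy data are assumptions made for illustration, and the first factor is set to zero when nobody remains at risk, following the zero-division convention stated earlier in the text.

```python
def weight_factors(t, x_control, x_treatment):
    """Return (phi1, phi2) at time t.

    phi1 = K1 / (K1 + K2): share of the at-risk set that is in the control group.
    phi2 = 1 + 4 * f * (1 - f), with f = (K1 + K2) / n the fraction of all
           subjects still at risk; phi2 lies between 1 and 2.
    """
    n = len(x_control) + len(x_treatment)
    k1 = sum(1 for x in x_control if x >= t)    # at risk, control group
    k2 = sum(1 for x in x_treatment if x >= t)  # at risk, treatment group
    frac = (k1 + k2) / n
    phi1 = k1 / (k1 + k2) if (k1 + k2) > 0 else 0.0
    phi2 = 1.0 + 4.0 * frac * (1.0 - frac)
    return phi1, phi2


# Toy data (hypothetical observed times for each group).
ctrl = [1.0, 2.0, 3.0, 4.0]
trt = [1.0, 2.0, 3.0, 4.0]
phi1, phi2 = weight_factors(2.5, ctrl, trt)
print(phi1 * phi2)  # combined weight at t = 2.5
```

The second factor peaks at 2 when half of all subjects are still at risk and falls back to 1 at either end of the data range, which matches the description of more weight in the central region.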

Let

$$U_n(t)=\int_0^t\phi(s)\,d\bar M_2(s)-\int_0^t\mu(s)\,d\bar M_1(s)-\int_0^t\nu(s)^T\,d\hat R(s)\,\Omega(\beta)\,Q, \qquad (4)$$

where

$$\mu(s)=\frac{\phi(s)K_2(s)}{K_1(s)D(R(s),\beta)\hat S_C(s)}-\frac{1}{K_1(s)}\int_s^t\frac{\phi(x)K_2(x)}{D(R(x),\beta)}\left\{\frac{\gamma_2(\hat\beta)}{D(\hat R(x),\hat\beta)\hat S_C(x)}-1\right\}d\hat R(x)$$

and

$$\nu(s)=\frac{\phi(s)K_2(s)}{nD(\hat R(s),\hat\beta)}\,W(R(s),\beta).$$

Let $\hat\mu$ be the estimator of $\mu$ defined by replacing $\beta$ and $R$ with $\hat\beta$ and $\hat R$, respectively, and define the estimator $\hat\nu$ of $\nu$ analogously. For the asymptotic distribution of the martingale residual-based test statistic, we have the following result.

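Theorem 2. Suppose that conditions 1–4 hold. Then, under model (2), the process $\sum_{i>n_1}\int_0^t\phi(s)\,d\hat M_i(s)/\sqrt n$, $0\le t\le\tau$, is asymptotically equivalent to the process $U_n(t)$, $0\le t\le\tau$, defined in (4), which converges weakly to a zero mean Gaussian process. A strongly consistent estimator of the covariance process of the limiting Gaussian process is given by

$$\begin{aligned}
\hat\sigma_1(s,t)={}&N_2(s)/n+\int_0^s\hat\mu^2(x)\,dN_1(x)/n\\
&+\int_0^s\hat\nu(x)^T\,d\hat R(x)\,\Omega(\hat\beta)\left(\int_0^\tau\hat W_T\hat W_T^T\,dN_2/n+\int_0^\tau\hat W_C\hat W_C^T\,dN_1/n\right)\Omega(\hat\beta)\int_0^t\hat\nu(x)\,d\hat R(x)\\
&-\int_0^s\hat\nu(x)^T\,d\hat R(x)\,\Omega(\hat\beta)\left(\int_0^t\hat W_T\,dN_2/n-\int_0^t\hat\mu\hat W_C\,dN_1/n\right)\\
&-\int_0^t\hat\nu(x)^T\,d\hat R(x)\,\Omega(\hat\beta)\left(\int_0^s\hat W_T\,dN_2/n-\int_0^s\hat\mu\hat W_C\,dN_1/n\right),\qquad 0\le s\le t\le\tau.
\end{aligned}$$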

In the literature, the martingale residuals have been very useful for checking the Cox proportional hazards regression model. However, the partial-sum processes of martingale residuals themselves do not have a simple interpretation. As an alternative to the martingale residual-based test, we can also obtain a test by contrasting the non-parametric and model-based estimators for the survivor function $S_T$ of the treatment group. Define $\hat S_T=\exp(-\hat\Lambda_T)$, where $\hat\Lambda_T$ is the Nelson–Aalen estimator for the cumulative hazard function $\Lambda_T$ (Nelson, 1969; Aalen, 1975). Note that under model (2), the cumulative hazard of the treatment group is $\ln\{1+\gamma_2(\beta)R(t)/\gamma_1(\beta)\}/\gamma_2(\beta)$, so that $S_T(t)=\{1+\gamma_2(\beta)R(t)/\gamma_1(\beta)\}^{-1/\gamma_2(\beta)}$. From this we can define

$$\tilde S_T(t)=\{1+\gamma_2(\hat\beta)\hat R(t)/\gamma_1(\hat\beta)\}^{-1/\gamma_2(\hat\beta)}$$

to be a model-based estimator for $S_T$. Intuitively, model (2) holds if the model-based survival estimator is close to the non-parametric estimator. This leads us to define a contrast-based test using $\hat S_T(t)-\tilde S_T(t)$, the difference between the estimated survival probabilities. Note that the Kaplan–Meier estimator could also be used in the contrast-based test. In various simulations, $\hat S_T$ results in a better performance for small samples, hence it is used in defining the test instead of the Kaplan–Meier estimator.
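The two survival estimators contrasted above can be sketched as follows. The model-based curve is written as the exponential of minus the cumulative hazard implied by model (2) for the treatment group, namely $\ln\{1+\gamma_2R/\gamma_1\}/\gamma_2$ with $\gamma_j=e^{-\beta_j}$. The function names and the toy inputs are assumptions of this sketch.

```python
import math


def model_survival(r, b1, b2):
    """Model-based treatment survival given a control odds value r = R(t)
    and parameters (b1, b2) = (log theta1, log theta2)."""
    g1, g2 = math.exp(-b1), math.exp(-b2)
    return (1.0 + g2 * r / g1) ** (-1.0 / g2)


def nelson_aalen_survival(t, times, events):
    """exp(-Nelson-Aalen cumulative hazard) at time t for one group."""
    data = sorted(zip(times, events))
    cum_haz, at_risk, i = 0.0, len(data), 0
    while i < len(data) and data[i][0] <= t:
        u = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == u:  # ties at time u
            deaths += data[i][1]
            removed += 1
            i += 1
        cum_haz += deaths / at_risk
        at_risk -= removed
    return math.exp(-cum_haz)


def contrast(t, r_value, b1, b2, times_t, events_t):
    """Difference between the non-parametric and model-based estimates at t."""
    return nelson_aalen_survival(t, times_t, events_t) - model_survival(r_value, b1, b2)
```

With both parameters equal to zero the model-based curve reduces to $1/(1+R)$, the control survival function, and with both equal to $\log\theta$ it reduces to the control survival function raised to the power $\theta$, as expected under proportional hazards.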

Let

$$V_n(t)=-\hat S_T(t)\int_0^t\frac{n\,d\bar M_2(s)}{K_2(s)}+\xi(t)\int_0^t\frac{n\,d\bar M_1(s)}{K_1(s)}+\tilde S_T(t)B(t)^T\Omega(\hat\beta)\,Q, \qquad (5)$$

where

$$\xi(t)=\frac{\tilde S_T(t)}{D(\hat R(t),\hat\beta)\hat S_C(t)},\qquad B(t)=\left(\frac{\hat R(t)}{D(\hat R(t),\hat\beta)},\ -\ln(\tilde S_T(t))-\frac{\hat R(t)}{D(\hat R(t),\hat\beta)}\right)^T.$$

The following result establishes the weak convergence of $\sqrt n\{\hat S_T(t)-\tilde S_T(t)\}$ on $t\in[0,\tau]$.

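Theorem 3. Suppose that conditions 1–4 are satisfied. Then, under model (2), the process $\sqrt n\{\hat S_T(t)-\tilde S_T(t)\}$, $0\le t\le\tau$, is asymptotically equivalent to the process $V_n$, $0\le t\le\tau$, defined in (5), which converges weakly to a zero mean Gaussian process. A strongly consistent estimator of the covariance process of the limiting Gaussian process is given by

$$\begin{aligned}
\hat\sigma_2(s,t)={}&\tilde S_T(s)\tilde S_T(t)B(s)^T\Omega(\hat\beta)\left(\int_0^\tau\hat W_T(x)\hat W_T^T(x)\,dN_2(x)/n+\int_0^\tau\hat W_C(x)\hat W_C^T(x)\,dN_1(x)/n\right)\Omega(\hat\beta)B(t)\\
&+\hat S_T(s)\hat S_T(t)\int_0^s\frac{n}{K_2^2(x)}\,dN_2(x)+\xi(s)\xi(t)\int_0^s\frac{n}{K_1^2(x)}\,dN_1(x)\\
&-\hat S_T(t)\tilde S_T(s)B(s)^T\Omega(\hat\beta)\int_0^t\frac{\hat W_T(x)}{K_2(x)}\,dN_2(x)-\hat S_T(s)\tilde S_T(t)B(t)^T\Omega(\hat\beta)\int_0^s\frac{\hat W_T(x)}{K_2(x)}\,dN_2(x)\\
&+\xi(t)\tilde S_T(s)B(s)^T\Omega(\hat\beta)\int_0^t\frac{\hat W_C(x)}{K_1(x)}\,dN_1(x)\\
&+\xi(s)\tilde S_T(t)B(t)^T\Omega(\hat\beta)\int_0^s\frac{\hat W_C(x)}{K_1(x)}\,dN_1(x),\qquad 0\le s\le t\le\tau. \qquad (6)
\end{aligned}$$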

With the asymptotic results above, we define a contrast-based test that rejects model (2) if $\sqrt n\,\sup_{t\le\tau}\phi_2(t)|\hat S_T(t)-\tilde S_T(t)|/\hat\sigma_2(t,t)$ is large. Note that we use the standardized process in defining the test. Furthermore, the weight function $\phi_2(t)$ is used to moderate the influence of data points near the boundaries of the data range. Use of the standardized process and the weight function is based on various numerical studies that indicate better performance of this setup.

The p-values of the tests are difﬁcult to obtain analytically. The bootstrap method provides

a practical alternative. It is, however, very time-consuming. The normal resampling approxi-

mation method of Lin et al. (1993) reduces computing time signiﬁcantly, and has become a

standard method. We will modify the method for our problem here.

Let i,i=1, ...,n, be independent variables that are also independent from the data. For

t≤, deﬁne the process

U(t)=1

√n⎡

⎣

i>n1t

d(iNi)−

i≤n1t

ˆd(iNi)

−t

dˆ

R(ˆ

)⎛

⎝

i>n1

WTd(iNi)+

i≤n1

WCd(iNi)⎞

⎠⎤

⎦

√n⎡

⎣

i>n1

XiiiI(Xi≤t)−

i≤n1

iiˆ(Xi)I(Xi≤t)

−t

dˆ

R(ˆ

)⎧

⎨

⎩

i≤n1

iiˆ

WC(Xi)I(Xi≤)+

i>n1

iiˆ

WT(Xi)I(Xi≤)⎫

⎬

⎭⎤

⎦,(7)

and

V(t)=1

√n⎡

⎣−ˆ

ST(t)

i>n1

ii

K2(Xi)I(Xi≤t)+(t)

i≤n1

ii

K1(Xi)I(Xi≤t)

560 S. Yang and Y. Zhao Scand J Statist 39

+˜

ST(t)B(t)T(ˆ

)⎧

⎨

⎩

i≤n1

iiˆ

WC(Xi)I(Xi≤)+

i>n1

iiˆ

WT(Xi)I(Xi≤)⎫

⎬

⎭⎤

⎦.(8)

In the approach of Lin et al. (1993), the ivalues are the standard normal variables. The

standard normal variables sometimes result in inﬂated empirical size in various simulation

studies for our problem here. Thus we need to make some adjustment. Speciﬁcally we will

choose i,i=1, ...,nto be independent normal variables that are independent of the data,

with mean zero and variance c2

nsuch that supncn<∞and cn→1. Conditional on the observed

data (Xi,i,Zi), i=1, ...,n, the processes ˆ

Uand ˆ

Vare sums of nindependent Gaussian pro-

cesses. It can be shown that, given the data, the processes ˆ

Uand ˆ

Vconverge weakly and

have the same limiting process as that of Unand V

n, respectively.

Let $r_o$, $c_o$ be the observed values of

$$\sup_{t\in[0,\tau]}\Big|\sum_{i>n_1}\int_0^t\phi(s)\,d\hat M_i(s)\Big|\Big/\sqrt n\quad\text{and}\quad\sup_{t\in[0,\tau]}\sqrt n\{\phi_2(t)|\hat S_T(t)-\tilde S_T(t)|/\hat\sigma_2(t,t)\},$$

respectively. The p-values

$$P\Big[\sup_{t\in[0,\tau]}\Big|\sum_{i>n_1}\int_0^t\phi(s)\,d\hat M_i(s)\Big|\Big/\sqrt n>r_o\Big],\qquad P\Big[\sup_{t\in[0,\tau]}\sqrt n\{\phi_2(t)|\hat S_T(t)-\tilde S_T(t)|/\hat\sigma_2(t,t)\}>c_o\Big]$$

can be approximated by

$$P\Big[\sup_{t\in[0,\tau]}|\hat U(t)|>r_o\Big],\qquad P\Big[\sup_{t\in[0,\tau]}\{\phi_2(t)|\hat V(t)|/\hat\sigma_2(t,t)\}>c_o\Big],$$

respectively, which in turn can be approximated by simulating the conditional distributions given the data a large number of times. We have the following result for the consistency of the tests.

Theorem 4. Suppose conditions 1–4 are satisfied. Then the martingale residual-based test is consistent against a general departure from model (2) on $t\in[0,\tau]$. The contrast-based test with the standardized process is consistent against a general departure from model (2) on $t\in[0,\tau]$, except for the degenerate case where $\sigma_2(t,t)$ is zero for some $t\in[0,\tau]$.
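The Monte Carlo approximation of the p-values described before Theorem 4 follows a generic pattern: hold the data fixed, redraw the normal multipliers many times, recompute the supremum of the resampled process, and report the fraction of draws exceeding the observed statistic. The sketch below shows only that pattern. It is deliberately simplified: `make_sup_simulator` perturbs a list of fixed, time-ordered jump sizes and omits the correction term that accounts for estimating the regression parameter, so it is not the full resampled process of the text. All names and inputs are hypothetical.

```python
import random


def make_sup_simulator(jumps, c_n=1.0):
    """Build a simulator for the supremum of a multiplier-perturbed step process.

    jumps : data-dependent jump sizes ordered by event time (held fixed)
    c_n   : standard deviation of the normal multipliers
    """
    def simulate(rng):
        total, sup = 0.0, 0.0
        for a in jumps:
            total += a * rng.gauss(0.0, c_n)  # one multiplier per event
            sup = max(sup, abs(total))
        return sup
    return simulate


def resampling_p_value(observed, simulate, n_sim=1000, seed=0):
    """Fraction of simulated suprema that exceed the observed statistic."""
    rng = random.Random(seed)
    exceed = sum(1 for _ in range(n_sim) if simulate(rng) > observed)
    return exceed / n_sim


# Toy usage with made-up jump sizes and a made-up observed statistic.
sim = make_sup_simulator([0.1] * 50)
print(resampling_p_value(0.9, sim, n_sim=1000, seed=1))
```

The argument `c_n` plays the role of the multiplier standard deviation, so the adjustment to the variance of the multipliers discussed in the text corresponds to passing a value slightly different from 1.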

3. Simulation studies and examples

3.1. Simulation studies

To fine-tune the tests and to evaluate their performance, we have conducted various simulation studies. As mentioned before, because of better performance for small samples, $\hat S_T$ is used in defining the contrast-based test instead of the Kaplan–Meier estimator. Note that the zero of $\nabla\tilde p$ provides an alternative estimator for $\beta$. It was found that the estimator $\hat\beta$ generally results in a better performance than this alternative estimator. For the $\varepsilon_i$ variables in (7), we can take $c_n\equiv1$. That is, no modification of the original normal approximation method is needed. For the $\varepsilon_i$ variables in (8), the choice $c_n=1+1/\sqrt n$ works well.

Regarding the choice of $\tau$, we found that in computing $\hat\beta$, we can take $\tau=\max_iX_i$ to include all data. After $\hat\beta$ is obtained, in simulating the conditional distributions of $\hat U$, $\hat V$, we can take $\tau=(\max_iX_iZ_i)\wedge(\max_iX_i(1-Z_i))$. Note that our presentation and proofs work with the situation where all processes are restricted to $[0,\tau]$ for $\tau$ in condition 2.
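The two choices of the upper limit of the data range just described are simple functions of the observed times and group indicators. A minimal sketch, with hypothetical function names and toy data, and assuming non-negative observed times and at least one subject per group:

```python
def range_for_estimation(x):
    """Upper limit used when computing the estimator: the largest observed time."""
    return max(x)


def range_for_resampling(x, z):
    """Upper limit used when simulating the conditional distributions:
    the smaller of the two group-wise maxima of the observed times."""
    max_treatment = max(xi for xi, zi in zip(x, z) if zi == 1)
    max_control = max(xi for xi, zi in zip(x, z) if zi == 0)
    return min(max_treatment, max_control)


# Toy data: first two subjects are controls (z = 0), last two are treated (z = 1).
x = [1.0, 5.0, 2.0, 3.0]
z = [0, 0, 1, 1]
print(range_for_estimation(x), range_for_resampling(x, z))
```

Taking the smaller group-wise maximum keeps the resampled processes on a range where both groups still contribute observations.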

Next, we report the results from some representative simulation studies. All numerical computations were done in Matlab. To evaluate the empirical size of the tests, we first generate data under the model of Yang & Prentice (2005). Lifetime variables were generated with $R(t)$ being the identity function. The values of $\beta$ were $(\log(0.9),\log(1.2))^T$ and $(\log(1.2),\log(0.8))^T$, representing a one-third increase or decrease from the initial hazard ratio, respectively. We will refer to these two cases as model I and model II, respectively. For the empirical power evaluation, lifetime variables were generated with the standard exponential distribution for the control group, and with

$$\frac{\lambda_T(t)}{\lambda_C(t)}=\begin{cases}a,&t\in(0,0.5)\cup(1.5,\infty),\\1/a,&t\in[0.5,1.5],\end{cases}$$

for $a=3$ and $a=0.5$. These two cases will be referred to as model III and model IV, respectively. Notice that model III gives a U-shaped hazard ratio while model IV gives an upside-down U-shaped hazard ratio, as opposed to a monotone hazard ratio implied by the model of Yang & Prentice (2005). The censoring variables were independent and identically distributed with the log-normal distribution, where the normal distribution had mean $c$ and standard deviation 0.5, with $c$ chosen to achieve various censoring rates. The empirical size and power were obtained from 1000 repetitions, and for each repetition, the critical values of the tests were calculated empirically from 1000 realizations of relevant conditional distributions. The results of these simulations are summarized in Tables 1 and 2.
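The data-generating step for the size study can be sketched by inverse-transform sampling. With the control odds function equal to the identity, the control survival function is $1/(1+t)$ and the treatment survival function under the model is $\{1+\gamma_2t/\gamma_1\}^{-1/\gamma_2}$ with $\gamma_j=e^{-\beta_j}$; both can be inverted in closed form. The original computations were done in Matlab, so the Python below is only an illustrative re-expression; the function name, the seed and the censoring mean used in the example are assumptions.

```python
import math
import random


def simulate_arm(n, b1, b2, treated, cens_mean, rng):
    """Simulate n censored observations for one arm.

    Returns a list of (observed time, event indicator) pairs.
    b1, b2    : log short-term and log long-term hazard ratios
    treated   : False gives the control arm (both gamma factors equal 1)
    cens_mean : mean of the normal variable underlying the log-normal censoring
    """
    g1 = math.exp(-b1) if treated else 1.0
    g2 = math.exp(-b2) if treated else 1.0
    out = []
    for _ in range(n):
        u = 1.0 - rng.random()                 # uniform on (0, 1]
        t = (u ** (-g2) - 1.0) * g1 / g2       # solves S(t) = u
        c = rng.lognormvariate(cens_mean, 0.5)  # censoring time
        out.append((min(t, c), 1 if t <= c else 0))
    return out


# Example in the spirit of model I (censoring mean 0.0 is an arbitrary choice).
rng = random.Random(2012)
control = simulate_arm(40, math.log(0.9), math.log(1.2), False, 0.0, rng)
treatment = simulate_arm(40, math.log(0.9), math.log(1.2), True, 0.0, rng)
censored = sum(1 - d for _, d in control + treatment)
print(censored / 80)  # realized censoring fraction for this draw
```

In practice the censoring mean would be tuned, for example by trial simulation, until the realized censoring fraction matches the target rate.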

From Table 1, both tests in general have the correct size. When the censoring is light

or moderate, the tests are generally conservative. On the other hand, when the censoring

is heavy, both tests may have inﬂated size for small sample size. This may be expected as

there is very little information available. For example, with n1=n2=80 and the censoring at

75 per cent, only about 40 of the total 160 data points are not censored. As the sample size

increases, the size generally improves. The multipliers $c_n$ used in the $\varepsilon_i$ variables could possibly be chosen to better control the size for small sample sizes with heavy censoring, but that may come at the expense of making the tests more conservative at light or moderate censoring levels.

From Table 2, under light or moderate censoring, generally the martingale residual-based test has some advantage for model III while the contrast-based test has some advantage for model IV. However, we see almost a reversal in performance when the censoring is heavy. These behaviours indicate that the two tests may possibly be powerful under different classes of alternatives. Also, both tests have reduced power when there is heavy censoring. One possible reason is that, due to heavy censoring, the hazard ratio in the data range is likely monotone instead of U-shaped or upside-down U-shaped, resulting in the power reduction. For example, for the last case with $n_1=n_2=160$, the average of $\max_iX_i$ from the simulated data sets, at the four censoring levels, is 3.89, 2.31, 1.33 and 0.56 for model III, and 4.65, 2.87, 1.76 and 0.82 for model IV. Note that the U-shape or upside-down U-shape is realized over a range containing [0, 1.5]. With heavy censoring, often there may not be enough data in that range. This also gives a possible reason for the power being higher at 75 per cent censoring than at 50 per cent censoring under model III. At 75 per cent censoring, given the average 0.56 of $\max_iX_i$, the available data are mostly in the [0, 0.5] range where the hazard ratio is constant. This simple situation possibly yields a better result compared with the situation at 50 per cent censoring, where the available data are in a range over which the hazard ratio has a rapid descent from 3 to 1/3.

Table 1. Empirical size of the lack-of-fit tests for model (2), at various sample sizes and censoring levels. $\beta=(\log(0.9),\log(1.2))^T$ for model I and $\beta=(\log(1.2),\log(0.8))^T$ for model II. The results were based on 1000 repetitions. For each repetition, the critical values of the tests were calculated from 1000 realizations of relevant conditional distributions

            Censoring rate, Model I             Censoring rate, Model II
Test        10%     30%     50%     75%         10%     30%     50%     75%
n1 = n2 = 40
Residual    0.0210  0.0340  0.0310  0.0460      0.0290  0.0380  0.0330  0.0690
Contrast    0.0220  0.0270  0.0410  0.0470      0.0300  0.0310  0.0370  0.0710
n1 = n2 = 80
Residual    0.0170  0.0170  0.0300  0.0450      0.0260  0.0330  0.0290  0.0730
Contrast    0.0190  0.0180  0.0320  0.0430      0.0220  0.0200  0.0220  0.0560
n1 = n2 = 160
Residual    0.0310  0.0380  0.0360  0.0520      0.0510  0.0400  0.0470  0.0540
Contrast    0.0260  0.0290  0.0420  0.0380      0.0290  0.0240  0.0300  0.0400

Table 2. Empirical power of the lack-of-fit tests for model (2), at various sample sizes and censoring levels, at model III with a U-shaped hazard ratio, and model IV with an upside-down U-shaped hazard ratio. The results were based on 1000 repetitions. For each repetition, the critical values of the tests were calculated from 1000 realizations of relevant conditional distributions

            Censoring rate, Model III           Censoring rate, Model IV
Test        10%     30%     50%     75%         10%     30%     50%     75%
n1 = n2 = 40
Residual    0.2810  0.3520  0.2330  0.2290      0.2630  0.2610  0.2780  0.1680
Contrast    0.1160  0.1590  0.2260  0.2880      0.5410  0.4530  0.3240  0.1580
n1 = n2 = 80
Residual    0.4390  0.5090  0.2660  0.3440      0.6220  0.6230  0.6260  0.2090
Contrast    0.2190  0.2720  0.2560  0.3450      0.9100  0.8820  0.7010  0.1600
n1 = n2 = 160
Residual    0.7530  0.7920  0.3010  0.3830      0.9320  0.9530  0.9390  0.3650
Contrast    0.7600  0.6410  0.2690  0.3900      0.9960  0.9940  0.9590  0.3210

3.2. The Women’s Health Initiative trial

The Women’s Health Initiative (WHI) randomized controlled trial of combined postmenopausal hormone therapy reported an elevated coronary heart disease risk and overall unfavourable health benefits versus risks over a 5.6-year study period (Writing Group for the Women’s Health Initiative Investigators, 2002; Manson et al., 2003). Here we look at the time to coronary heart disease in the WHI clinical trial, which included 16,608 postmenopausal women initially in the age range of 50–79 with uterus. The trial has two arms. The placebo arm has sample size 8102, and the estrogen plus progestin arm has sample size 8506. About 98 per cent of the data are censored, primarily by the trial monitoring time. For this data set, there was strong evidence that the hazards were non-proportional. In Prentice et al. (2005), a time axis partition was used and the relative risk estimate was obtained separately over the intervals 0–2, 2–5 and >5 years. Fitting model (2) to this data set with the placebo group being the control group, we get $\hat\beta=(0.636,-3.601)^T$. The martingale residual-based test has the p-value of 0.438, and the contrast-based test has the p-value of 0.427. Thus both tests indicate a good fit of the model. Figure 1 gives the plots of the non-parametric survival curve $\hat S_T$ and the model-based survival curve $\tilde S_T$ for the treatment group, or the estrogen plus progestin group. We see that the model-based and non-parametric survival curves are very close to each other.

Now if we switch the group label and use the estrogen plus progestin group as the control group, then both tests result in p-values less than 0.01, discrediting the model. Figure 2 gives the plots of $\hat S_T$ and $\tilde S_T$ for the placebo arm, which is the ‘treatment group’ under the assumed model. We see that in this way, compared with the model using the placebo arm as the control group, there is some noticeable gap between the model-based survival curve and non-parametric survival curve in the middle of the data range. One possible reason may be that the data yield a Kaplan–Meier curve for the placebo group which behaves very differently near the end of the data range, with jumps considerably larger than at early or middle time points. In comparison, for the estrogen plus progestin group, as displayed in Fig. 1, the Kaplan–Meier curve behaves more or less linearly throughout, with jumps less dramatic near the end of the data range.

Fig. 1. Estimated survival curves for the WHI data when fitting model (2) using the placebo group as the control group: solid line, Kaplan–Meier curve for the estrogen plus progestin group; dashed line, model-based estimator for the estrogen plus progestin group. (Horizontal axis: days; vertical axis: % survival.)

Fig. 2. Estimated survival curves for the WHI data when fitting model (2) using the estrogen plus progestin group as the control group: solid line, Kaplan–Meier curve for the placebo group; dashed line, model-based estimator for the placebo group. (Horizontal axis: days; vertical axis: % survival.)


3.3. The gastrointestinal tumour study

Next, we look at an example where the hazards are highly non-proportional in that both the hazard functions and the survivor functions cross. The Gastrointestinal Tumour Study Group (1982) reported the results of a trial that compared chemotherapy with combined chemotherapy and radiation therapy, in the treatment of locally unresectable gastric cancer. There were 45 patients on each treatment. Two observations in the chemotherapy group and six in the combination group were censored. Kaplan–Meier plots of the two estimated survival curves cross at around 1000 days (see p. 10 of Yang & Prentice, 2005). If we let the chemotherapy group be the control group and fit model (2) to the data, then we have $\hat\beta=(1.714,-0.981)^T$. The martingale residual-based test has the p-value of 0.104, and the contrast-based test has the p-value of 0.595. Thus the martingale residual-based test signals some degree of lack of fit, while the contrast-based test indicates a good fit. Figure 3 gives the plots of $\hat S_T$ and $\tilde S_T$ for the treatment group, with combined chemotherapy and radiation therapy. We see that the model-based survival curve and the non-parametric survival curve are mostly close to each other, except for a very small region around 230 days, where both the model-based survival curve and the non-parametric survival curve descend rapidly and have their largest discrepancy. This discrepancy affects the martingale residual-based test more than the contrast-based test. Plots of $\sum_{i>n_1}\int_0^t\phi(s)\,d\hat M_i(s)/\sqrt n$ and a few realizations of the process $\hat U(t)$ given the data, shown in appendix S2, suggest that the low p-value of the martingale residual-based test is mainly caused by behaviours of the processes on that small region.

Now if we let the combined chemotherapy and radiation therapy group be the control group and fit model (2) to the data, then the martingale residual-based test has the p-value of 0.634, while the contrast-based test has the p-value of 0.354. Figure 4 gives the plots of $\hat S_T$ and $\tilde S_T$ for the chemotherapy group, which is the ‘treatment group’ under the assumed model. We see that, compared with the scenario in Fig. 3 before, the difference between the model-based survival curve and non-parametric survival curve is more pronounced over most of the data range. Here the model fit seems worse compared with that in Fig. 3, but the p-value of the martingale residual-based test is less extreme. Plots of $\sum_{i>n_1}\int_0^t\phi(s)\,d\hat M_i(s)/\sqrt n$ and a few realizations of the process $\hat U(t)$ given the data suggest that, for $t$ up to about 700 days, the variance process of $\hat U(t)$ is generally larger than for the case in Fig. 3, helping to avoid a low p-value for the martingale residual-based test. These behaviours may plausibly be due to the high non-proportionality of the two groups combined with a moderate sample size.

Fig. 3. Estimated survival curves for the gastric data when fitting model (2) using the chemo group as the control group: solid line, Kaplan–Meier curve for the combination group; dashed line, model-based estimator for the combination group. (Horizontal axis: days; vertical axis: % survival.)

Fig. 4. Estimated survival curves for the gastric data when fitting model (2) using the combination group as the control group: solid line, Kaplan–Meier curve for the chemo group; dashed line, model-based estimator for the chemo group. (Horizontal axis: days; vertical axis: % survival.)

4. Discussion

We have investigated Kolmogorov–Smirnov type tests, which use the supremum norm of the relevant processes. Alternatively, tests based on the integrated absolute value of the processes can be considered. Such integrated-type test statistics are related to the Cramér–von Mises tests. Also, the contrast between the Nelson–Aalen estimator of the cumulative hazard and the model-based estimator can be used instead of the contrast between the Kaplan–Meier estimator and the model-based estimator. In some preliminary simulation studies, those tests did not provide improvement over the tests we focused on here, and thus are omitted.
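The distinction between supremum-type and integrated-type statistics can be made concrete with a short sketch. Assuming a test process that has been evaluated on a grid of time points and is treated as a step function, the Python below computes a supremum-type statistic and two integrated-type statistics. The function names and the toy process are hypothetical, and the sketch says nothing about how critical values would be obtained.

```python
def sup_statistic(values):
    """Supremum-type statistic: largest absolute value of the process."""
    return max(abs(v) for v in values)


def integrated_abs_statistic(times, values):
    """Integral of |W(t)| for a step process equal to values[k] on [times[k], times[k+1])."""
    total = 0.0
    for k in range(len(times) - 1):
        total += abs(values[k]) * (times[k + 1] - times[k])
    return total


def integrated_sq_statistic(times, values):
    """Integral of W(t)^2 for the same step process (Cramer-von Mises flavour)."""
    total = 0.0
    for k in range(len(times) - 1):
        total += values[k] ** 2 * (times[k + 1] - times[k])
    return total


# Hypothetical process values on a hypothetical time grid.
grid = [0.0, 1.0, 2.0, 4.0]
process = [0.0, 0.5, -1.0, 0.25]

print(sup_statistic(process))
print(integrated_abs_statistic(grid, process))
print(integrated_sq_statistic(grid, process))
```

A supremum-type statistic reacts to a single large excursion, whereas an integrated statistic accumulates moderate departures over the whole time range, which is why the two can rank the same data differently.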

The model of Yang & Prentice (2005) implies that the hazard ratio is monotone. While this covers many practical situations, other patterns of the hazard ratio are possible. When the tests indicate a lack of fit for the model of Yang & Prentice (2005), it is possible to remedy the situation by considering larger classes of semiparametric models to incorporate an even wider range of hazard ratio patterns. Also, in addition to the two-sample case considered here, adjustment for covariates may be considered.
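To make the monotonicity concrete, recall that in Yang & Prentice (2005) the hazard ratio of the treatment group to the control group at time $t$ is $\theta_1\theta_2/\{\theta_1+(\theta_2-\theta_1)S_C(t)\}$, where $S_C$ is the control survival function and $\theta_1$, $\theta_2$ are the short-term and long-term hazard ratios. The sketch below assumes that this is the form of model (2); the function name and the parameter values are hypothetical and are not the estimates reported for the gastric data.

```python
def yp_hazard_ratio(s_control, theta1, theta2):
    """Treatment-to-control hazard ratio at a time where control survival is s_control.

    Assumed form (Yang & Prentice, 2005):
        theta1 * theta2 / (theta1 + (theta2 - theta1) * S_C(t))
    """
    return theta1 * theta2 / (theta1 + (theta2 - theta1) * s_control)


# Hypothetical short-term and long-term hazard ratios.
theta1, theta2 = 2.0, 0.5

# Control survival decreases from 1 to 0 as time increases.
survival_grid = [1.0 - k / 10 for k in range(11)]
ratios = [yp_hazard_ratio(s, theta1, theta2) for s in survival_grid]

# The ratio starts at theta1 (S_C = 1) and ends at theta2 (S_C = 0).
print(ratios[0], ratios[-1])

# With theta1 > theta2 the ratio never increases along the grid.
is_monotone = all(a >= b for a, b in zip(ratios, ratios[1:]))
print(is_monotone)

# Proportional hazards submodel: theta1 == theta2 gives a constant ratio.
ph_ratios = [yp_hazard_ratio(s, 1.5, 1.5) for s in survival_grid]
```

Because $S_C(t)$ is itself monotone in $t$, the ratio can only move in one direction between $\theta_1$ and $\theta_2$, so a hazard ratio that rises and then falls cannot be represented without enlarging the model class.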

Acknowledgements

The authors thank the editors and reviewers for their constructive comments and suggestions that have led to an improved manuscript. The research of Yichuan Zhao was partially supported by NSA grant #H98230-12-1-0209.


Supporting Information

Additional Supporting Information may be found in the online version of this article:

Appendix S1. Proofs of asymptotic results.

Appendix S2. Additional plots for the gastrointestinal tumor study.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

References

Aalen, O. O. (1975). Statistical inference for a family of counting processes. PhD thesis, University of California, Berkeley.

Bennett, S. (1983). Analysis of survival data by the proportional odds model. Statist. Med. 2, 273–277.

Bickel, P. J., Klaassen, C. A. J., Ritov, Y. & Wellner, J. A. (1993). Efficient and adaptive estimation for semiparametric models. Johns Hopkins University Press, Baltimore.

Chen, Y. Q. & Cheng, S. (2006). Linear life expectancy regression with censored data. Biometrika 93, 303–313.

Chen, K., Jin, Z. & Ying, Z. (2002). Semiparametric analysis of transformation models with censored data. Biometrika 89, 659–668.

Cheng, S. C., Wei, L. J. & Ying, Z. (1995). Analysis of transformation models with censored data. Biometrika 82, 835–845.

Cox, D. R. (1972). Regression models and life-tables (with discussion). J. Roy. Statist. Soc. B 34, 187–220.

Gastrointestinal Tumor Study Group: Schein, P. D., Bruckner, H. W., Douglass, H. O., Mayer, R. et al. (1982). A comparison of combination chemotherapy and combined modality therapy for locally advanced gastric carcinoma. Cancer 49, 1771–1777.

Kalbfleisch, J. D. & Prentice, R. L. (2002). The statistical analysis of failure time data, 2nd edn. Wiley, Hoboken, NJ.

Kaplan, E. & Meier, P. (1958). Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53, 457–481.

Lin, D. Y., Wei, L. J. & Ying, Z. (1993). Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80, 557–572.

Manson, J. E., Hsia, J., Johnson, K. C., Rossouw, J. E., Assaf, A. R., Lasser, N. L., Trevisan, M., Black, H. R., Heckbert, S. R., Detrano, R., Strickland, O. L., Wong, N. D., Crouse, J. R., Stein, E. & Cushman, M., for the Women’s Health Initiative Investigators (2003). Estrogen plus progestin and the risk of coronary heart disease. New Engl. J. Med. 349, 523–534.

Murphy, S. A., Rossini, A. J. & Van der Vaart, A. W. (1997). Maximum likelihood estimation in the proportional odds model. J. Amer. Statist. Assoc. 92, 968–976.

Nelson, W. (1969). Hazard plotting for incomplete failure data. J. Qual. Technol. 1, 27–52.

Prentice, R. L., Langer, R., Stefanick, M. L., Howard, B. V., Pettinger, M., Anderson, G., Barad, D., Curb, J. D., Kotchen, J., Kuller, L., Limacher, M. & Wactawski-Wende, J., for the Women’s Health Initiative Investigators (2005). Combined postmenopausal hormone therapy and cardiovascular disease: toward resolving the discrepancy between observational studies and the Women’s Health Initiative clinical trial. Am. J. Epi. 162, 404–414.

Therneau, T. M., Grambsch, P. M. & Fleming, T. R. (1990). Martingale-based residuals for survival models. Biometrika 77, 147–160.

Tsodikov, A. (2002). Semi-parametric models of long- and short-term survival: an application to the analysis of breast cancer survival in Utah by age and state. Statist. Med. 21, 895–920.

Writing Group for the Women’s Health Initiative Investigators (2002). Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women’s Health Initiative randomized controlled trial. J. Amer. Med. Assoc. 288, 321–333.

Yang, S. & Prentice, R. L. (1999). Semiparametric inference in the proportional odds regression model. J. Amer. Statist. Assoc. 94, 125–136.

Yang, S. & Prentice, R. L. (2005). Semiparametric analysis of short-term and long-term hazard ratios with two-sample survival data. Biometrika 92, 1–17.

Ying, Z. (1993). A large sample study of rank estimation for censored regression data. Ann. Statist. 21, 76–99.

Zeng, D. & Lin, D. Y. (2007). Maximum likelihood estimation in semiparametric regression models with censored data. J. Roy. Statist. Soc. B 69, 507–564.

Received January 2010, in final form March 2012

Song Yang, PhD, Office of Biostatistics Research, Division of Cardiovascular Sciences, National Heart, Lung and Blood Institute, NIH, DHHS, 6701 Rockledge Dr. MSC 7913, Bethesda, MD 20892, USA.

E-mail: yangso@nhlbi.nih.gov
