Article

Residual Bootstrap Test for Interactions in Biomarker Threshold Models with Survival Data

Abstract

Many new treatments in cancer clinical trials tend to benefit a subset of patients more than others. To avoid unnecessary therapies and failure to recognize beneficial treatments, biomarker threshold models are often used to identify this subset of patients. We are interested in testing the treatment–biomarker interaction effects in a threshold model in which the biomarker cut point is unknown. The unknown cut point causes irregularity in the model, and the traditional likelihood ratio test cannot be applied directly. A test for biomarker–treatment interaction effects is developed using a residual bootstrap method to approximate the distribution of the proposed test statistic. We evaluate the residual bootstrap and the permutation methods through an extensive simulation study and find that the residual bootstrap method gives accurate test size, while the permutation method sometimes fails to control the type I error in the presence of main treatment effects. The proposed residual bootstrap test can be used to explore potential treatment-by-biomarker interactions in clinical studies, and the findings can guide the design of follow-up trials that use the biomarker as a stratification factor. We apply the proposed residual bootstrap method to data from the Breast International Group (BIG) 1-98 randomized clinical trial and show that patients with a high Ki-67 level may benefit more from letrozole treatment.
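
The abstract does not state the model explicitly; a common formulation in the threshold Cox literature it draws on (e.g., Jiang et al., 2007), offered here only as a sketch of the setting, is a proportional hazards model with an unknown biomarker cut point,

$$ h(t \mid z, x) \;=\; h_0(t)\, \exp\{\beta z + \lambda\, z\, I(x > c)\}, $$

where $z$ is the treatment indicator, $x$ the biomarker, $I(\cdot)$ the indicator function and $c$ the unknown cut point. The interaction test is $H_0: \lambda = 0$; under $H_0$ the cut point $c$ disappears from the model, which is exactly the irregularity that invalidates the standard likelihood ratio test and motivates the residual bootstrap calibration.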

... Further, Fan et al. (2017) recently considered testing for and identifying a subgroup with an enhanced treatment effect by testing $H_0: \lambda_0 = 0$ in a model similar to (1.1), but assuming that $\gamma_0 = 0$. In both the literature related to prognostic classification and Fan et al. (2017), the setup is non-standard in the sense that $c_0$ is not identifiable under the null (Davies, 1977). Jiang et al. (2007), He (2014), Gavanji et al. (2018) and Götte et al. (2020) consider adjustments to the minimum p-value statistics for survival endpoints in the context of predictive classification. However, no theoretical justification has been provided for the size validity of the adjusted tests. ...
... For example, when the clinical outcome is survival, Jiang et al. (2007) proposed a biomarker-adaptive threshold model that combines a test for the overall treatment effect with finding a cutpoint for a prespecified biomarker. Chen et al. (2014) and Gavanji et al. (2018) proposed Bayesian and residual bootstrap methods, respectively, for inference on the parameters of a threshold Cox proportional hazards model to identify treatment-sensitive subgroups based on survival data. ...
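
To make the cutpoint search concrete, here is a minimal sketch (not the authors' code) of profiling the Cox partial likelihood over candidate cut points with the Python lifelines package; the column names, the treatment-by-subset interaction model, and the candidate grid are illustrative assumptions.

```python
# Sketch: profile a threshold Cox model over candidate biomarker cut points.
# 'df' is a pandas DataFrame with columns 'time', 'event', 'treat', 'biomarker'.
import numpy as np
from lifelines import CoxPHFitter

def profile_cutpoint(df, grid):
    """Fit h(t) = h0(t) exp(b1*treat + b2*treat*I(x > c)) for each c in grid;
    return the cut point maximizing the partial log-likelihood."""
    best_c, best_ll = None, -np.inf
    for c in grid:  # grid should stay within interior quantiles of the biomarker
        d = df.copy()
        d["interact"] = d["treat"] * (d["biomarker"] > c).astype(float)
        cph = CoxPHFitter().fit(d[["time", "event", "treat", "interact"]],
                                duration_col="time", event_col="event")
        if cph.log_likelihood_ > best_ll:
            best_c, best_ll = c, cph.log_likelihood_
    return best_c, best_ll
```

In practice the grid is restricted to interior quantiles of the biomarker (say, the 10th to 90th percentiles) so that each candidate subset retains enough events to fit the model.
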
Article
Full-text available
Subgroup identification has emerged as a popular statistical tool to assess the heterogeneity in treatment effects based on specific characteristics of patients. Recently, a threshold linear mixed-effects model was proposed to identify a subgroup of patients who may benefit from treatment concerning longitudinal outcomes based on whether a continuous biomarker exceeds an unknown cut-point. This model assumes, however, normal distributions for both the random effects and the error terms and may therefore be sensitive to outliers in the longitudinal outcomes. In this paper, we propose a robust subgroup identification method for longitudinal data by developing a robust threshold t linear mixed-effects model, where random effects and within-subject errors follow a multivariate t distribution with unknown degrees of freedom. The likelihood function is, however, difficult to maximize directly because the density function of a non-central t distribution is too complicated to compute and an indicator function is included in the definition of the model. We therefore propose a smoothed expectation conditional maximization algorithm based on a gamma-normal hierarchical structure and a smooth approximation of the indicator function to make inferences on the parameters in the model. Simulation studies are conducted to investigate the performance and robustness of the proposed method. As an application, the proposed method is used to identify a subgroup of patients with advanced colorectal cancer who may have a better quality of life when treated with cetuximab.
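
The abstract does not give the particular smooth approximation used; a standard choice for smoothing a subgroup indicator, which the described algorithm plausibly resembles, is the logistic surrogate

$$ I(x > c) \;\approx\; \frac{1}{1 + \exp\{-(x - c)/h\}}, $$

which converges to the indicator as the bandwidth $h \downarrow 0$ and makes the likelihood differentiable in the cut-point $c$, so the conditional maximization steps can use gradient-based updates.
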
... The independent-data bootstrap has corresponding sampling methods for cross-sectional data, time-series data and panel data (Kim and Lee, 2018). For example, the problem of heteroscedasticity in data is treated in Su and Yang (2018); time-series problems are addressed by the block bootstrap (Pešta, 2017), the residual bootstrap (Gavanji et al., 2018), the parametric bootstrap (Boynton and Chen, 2018) and so on. ...
Article
Purpose: The purpose of this paper is to introduce the error ellipse into the bootstrap method to improve the reliability of small samples and the credibility of the S-N curve.
Design/methodology/approach: Based on the bootstrap method and the reliability of the original samples, two error ellipse models are proposed. The error ellipse model reasonably predicts that the discrete law of expanded virtual samples obeys a two-dimensional normal distribution.
Findings: By comparing parameters obtained by the bootstrap method, the improved bootstrap method (normal distribution) and the error ellipse methods, it is found that the error ellipse method expands the sampling range and shortens the confidence interval, which improves the accuracy of parameter estimation with small samples. Through case analysis, it is proved that the tangent error ellipse method is feasible, and the series of S-N curves obtained by the tangent error ellipse method is reasonable.
Originality/value: The error ellipse methods can lay a technical foundation for the life prediction of products and have progressive significance for the quality evaluation of products.
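
As a rough illustration of the stated idea (virtual samples scattered around each observed point according to a two-dimensional normal law), the following Python sketch expands a small (log-stress, log-life) sample; the covariance matrix is a placeholder assumption, not the paper's calibrated error ellipse.

```python
# Sketch: expand a small (log S, log N) sample with virtual points drawn
# from a bivariate normal centred at each observed point.
import numpy as np

rng = np.random.default_rng(0)
observed = np.array([[2.30, 5.1], [2.25, 5.6], [2.20, 6.0]])  # toy (log S, log N)
cov = np.array([[0.001, 0.0],                                 # placeholder
                [0.0,   0.01]])                               # error-ellipse covariance

virtual = np.vstack([rng.multivariate_normal(p, cov, size=20) for p in observed])
# 'virtual' can be pooled with 'observed' before bootstrap fitting of the S-N curve.
```
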
Article
In a clinical trial, the responses to the new treatment may vary among patient subsets with different characteristics in a biomarker. It is often necessary to examine whether there is a cutpoint for the biomarker that divides the patients into two subsets of those with more favourable and less favourable responses. More generally, we approach this problem as a test of homogeneity in the effects of a set of covariates in generalized linear regression models. The unknown cutpoint results in a model with nonidentifiability and a nonsmooth likelihood function to which the ordinary likelihood methods do not apply. We first use a smooth continuous function to approximate the indicator function defining the patient subsets. We then propose a penalized likelihood ratio test to overcome the model irregularities. Under the null hypothesis, we prove that the asymptotic distribution of the proposed test statistic is a mixture of chi‐squared distributions. Our method is based on established asymptotic theory, is simple to use, and works in a general framework that includes logistic, Poisson, and linear regression models. In extensive simulation studies, we find that the proposed test works well in terms of size and power. We further demonstrate the use of the proposed method by applying it to clinical trial data from the Digitalis Investigation Group (DIG) on heart failure.
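
For readers unfamiliar with the limiting law, a mixture of chi-squared distributions has the generic form

$$ T \;\xrightarrow{d}\; \sum_{j} w_j\, \chi^2_{d_j}, \qquad w_j \ge 0,\ \sum_j w_j = 1, $$

so null tail probabilities are computed as $P(T > t) = \sum_j w_j P(\chi^2_{d_j} > t)$; the specific weights $w_j$ and degrees of freedom $d_j$ come from the authors' asymptotic theory and are not reproduced in the abstract.
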
Article
Full-text available
The discovery of biomarkers that predict treatment effectiveness has great potential for improving medical care, particularly in oncology. These biomarkers are increasingly reported on a continuous scale, allowing investigators to explore how treatment efficacy varies as the biomarker values continuously increase, as opposed to using arbitrary categories of expression levels resulting in a loss of information. In the age of biomarkers as continuous predictors (eg, expression level percentage rather than positive v negative), alternatives to such dichotomized analyses are needed. The purpose of this article is to provide an overview of an intuitive statistical approach, the subpopulation treatment effect pattern plot (STEPP), for evaluating treatment-effect heterogeneity when a biomarker is measured on a continuous scale. STEPP graphically explores the patterns of treatment effect across overlapping intervals of the biomarker values. As an example, STEPP methodology is used to explore patterns of treatment effect for varying levels of the biomarker Ki-67 in the BIG (Breast International Group) 1-98 randomized clinical trial comparing letrozole with tamoxifen as adjuvant therapy for postmenopausal women with hormone receptor-positive breast cancer. STEPP analyses showed patients with higher Ki-67 values who were assigned to receive tamoxifen had the poorest prognosis and may benefit most from letrozole.
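
A minimal sketch of the sliding-window idea behind STEPP (an illustration, not the authors' implementation): estimate a treatment log hazard ratio within overlapping subpopulations ordered by the biomarker and plot it against the window centre. The column names and window sizes are illustrative assumptions.

```python
# Sketch: STEPP-style sliding-window treatment effects with lifelines.
# 'df' is a pandas DataFrame with columns 'time', 'event', 'treat', 'biomarker'.
from lifelines import CoxPHFitter

def stepp_effects(df, width=100, step=50):
    """Return (window-centre biomarker value, log hazard ratio of 'treat')
    for overlapping subpopulations ordered by 'biomarker'."""
    df = df.sort_values("biomarker").reset_index(drop=True)
    points = []
    for start in range(0, len(df) - width + 1, step):
        sub = df.iloc[start:start + width]
        cph = CoxPHFitter().fit(sub[["time", "event", "treat"]],
                                duration_col="time", event_col="event")
        points.append((sub["biomarker"].median(), cph.params_["treat"]))
    return points
```

Plotting the returned points gives the pattern plot; the published method additionally provides confidence bands that account for the overlap between windows.
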
Article
Full-text available
We consider a nonregular Cox model for independent and identically distributed right-censored survival times, with a change-point according to the unknown threshold of a covariate. The maximum partial likelihood estimators of the parameters and the estimator of the baseline cumulative hazard are studied. We prove that the estimator of the change-point is $n$-consistent and the estimators of the regression parameters are $n^{1/2}$-consistent, and we establish the asymptotic distributions of the estimators. The estimators of the regression parameters and of the baseline cumulative hazard are adaptive in the sense that they do not depend on knowledge of the change-point.
Article
Full-text available
I-SPY 2 (investigation of serial studies to predict your therapeutic response with imaging and molecular analysis 2) is a process targeting the rapid, focused clinical development of paired oncologic therapies and biomarkers. The framework is an adaptive phase II clinical trial design in the neoadjuvant setting for women with locally advanced breast cancer. I-SPY 2 is a collaborative effort among academic investigators, the National Cancer Institute, the US Food and Drug Administration, and the pharmaceutical and biotechnology industries under the auspices of the Foundation for the National Institutes of Health Biomarkers Consortium.
Article
Full-text available
Many molecularly targeted anticancer agents entering the definitive stage of clinical development benefit only a subset of treated patients. As a result, traditional broad-eligibility randomized trials may miss effective agents because the overall treatment effect is diluted. We propose a statistically rigorous biomarker-adaptive threshold phase III design for settings in which a putative biomarker to identify patients who are sensitive to the new agent is measured on a continuous or graded scale. The design combines a test for overall treatment effect in all randomly assigned patients with the establishment and validation of a cut point for a prespecified biomarker of the sensitive subpopulation. The performance of the biomarker-adaptive design, relative to a traditional design that ignores the biomarker, was evaluated in a simulation study. The biomarker-adaptive design was also used to analyze data from a prostate cancer trial. In the simulation study, the biomarker-adaptive design preserved the power to detect the overall effect when the new treatment is broadly effective. When the proportion of sensitive patients as identified by the biomarker is low, the proposed design provided a substantial improvement in efficiency compared with the traditional trial design. Recommendations for sample size planning and implementation of the biomarker-adaptive design are provided. A statistically valid test for a biomarker-defined subset effect can be prospectively incorporated into a randomized phase III design without compromising the ability to detect an overall effect if the intervention is beneficial in a broad population.
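
Schematically (the paper's exact procedure differs in its details and calibration), a combined statistic for such a design can be written as

$$ T \;=\; \max\Big\{ T_{\text{overall}},\ \sup_{c \in \mathcal{C}} T(c) \Big\}, $$

where $T_{\text{overall}}$ tests the treatment effect in all randomized patients, $T(c)$ tests it in the biomarker-defined subset $\{x > c\}$ over a candidate set $\mathcal{C}$, and the null distribution of $T$ is calibrated by resampling or an alpha-splitting rule so that the overall type I error is preserved.
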
Article
We introduce the subpopulation treatment effect pattern plot (STEPP) method, designed to facilitate the interpretation of estimates of treatment effect derived from different but potentially overlapping subsets of clinical trial data. In particular, we consider sequences of subpopulations defined with respect to a covariate, and obtain confidence bands for the collection of treatment effects (here obtained from the Cox proportional hazards model) associated with the sequences. The method is aimed at determining whether the magnitude of the treatment effect changes as a function of the values of the covariate. We apply STEPP to a breast cancer clinical trial data set to evaluate the treatment effect as a function of the oestrogen receptor content of the primary tumour.
Article
The established molecular heterogeneity of human cancers, and the resulting stratification of conventional diagnostic categories, require new paradigms for building a reliable basis for predictive medicine. We review clinical trial designs for the development of new therapeutics and of predictive biomarkers to inform their use. We cover designs for a wide range of settings. At one extreme is the development of a new drug with a single biomarker and strong biological evidence that marker-negative patients are unlikely to benefit from the new drug. At the other extreme are phase III clinical trials involving both genome-wide discovery and internal validation of a predictive classifier that identifies the patients most likely, and those unlikely, to benefit from the new drug.
Article
Adaptive designs have become popular in clinical trials and drug development. Unlike traditional trial designs, adaptive designs use accumulating data to modify the ongoing trial without undermining its integrity and validity. As a result, adaptive designs provide a flexible and effective way to conduct clinical trials. The designs have the potential advantages of improving the study power, reducing the sample size and total cost, treating more patients with more effective treatments, identifying efficacious drugs for specific subgroups of patients based on their biomarker profiles, and shortening the time for drug development. In this article, we review adaptive designs commonly used in clinical trials and investigate several aspects of the designs, including the dose-finding scheme, interim analysis, adaptive randomization, biomarker-guided randomization, and seamless designs. For illustration, we provide examples of real trials conducted with adaptive designs. We also discuss practical issues from the perspective of using adaptive designs in oncology trials.
Article
Some baseline patient factors, such as biomarkers, are useful in predicting patients' responses to a new therapy. Identification of such factors is important in enhancing treatment outcomes, avoiding potentially toxic therapy that is destined to fail and improving the cost-effectiveness of treatment. Many of the biomarkers, such as gene expression, are measured on a continuous scale. A threshold of the biomarker is often needed to define a sensitive subset for making easy clinical decisions. A novel hierarchical Bayesian method is developed to make statistical inference simultaneously on the threshold and the treatment effect restricted to the sensitive subset defined by the biomarker threshold. In the proposed method, the threshold parameter is treated as a random variable that takes values with a certain probability distribution. The observed data are used to estimate parameters in the prior distribution for the threshold, so that the posterior is less dependent on the prior assumption. The proposed Bayesian method is evaluated through simulation studies. Compared to the existing approaches such as the profile likelihood method, which makes inferences about the threshold parameter using the bootstrap, the proposed method provides better finite sample properties in terms of the coverage probability of a 95% credible interval. The proposed method is also applied to a clinical trial of prostate cancer with the serum prostatic acid phosphatase (AP) biomarker.
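
Schematically, and only as a sketch of the stated hierarchy (the abstract does not give the exact distributions), the threshold is treated as a random quantity,

$$ c \sim \pi(c \mid \eta), \qquad \text{data} \mid c, \theta \sim L(c, \theta), $$

with the hyperparameters $\eta$ of the prior estimated from the observed data, so that the posterior $p(c, \theta \mid \text{data})$ is less sensitive to the initial prior guess; credible intervals for the threshold and for the treatment effect restricted to the sensitive subset then come from this joint posterior.
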
Article
This paper presents theory and applications of a simple graphical method, called hazard plotting, for the analysis of multiply censored life data consisting of failure times of failed units intermixed with running times on unfailed units. Applications of the method are given for multiply censored data on service life of equipment, for strength data on an item with different failure modes, and for biological data multiply censored on both sides from paired comparisons. Theory for the hazard plotting method, which is based on the hazard function of a distribution, is developed from the properties of order statistics from Type II multiply censored samples.
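
Hazard plotting is the graphical ancestor of the cumulative hazard estimators in modern survival software; here is a minimal sketch (the classic method uses special hazard paper by hand) with the Python lifelines package, where the toy failure and running times are illustrative:

```python
# Sketch: cumulative hazard plot for multiply censored life data.
from lifelines import NelsonAalenFitter

durations = [5, 8, 12, 12, 15, 20, 24]   # failure or running times (toy data)
observed  = [1, 0,  1,  1,  0,  1,  0]   # 1 = failed, 0 = still running

naf = NelsonAalenFitter().fit(durations, event_observed=observed)
ax = naf.plot()                           # cumulative hazard versus time
ax.figure.savefig("hazard_plot.png")
```

On the plotting scale appropriate to an assumed life distribution, the points fall near a straight line when the distribution fits, which is the visual check the hazard plotting method provides.
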
Article
A resampling plan is introduced for bootstrapping regression parameter estimators for the Cox (1972) proportional hazards regression model when explanatory variables are nonrandom constants fixed by the design of the experiment. The plan is an analog of the residual-resampling method for regression introduced by Efron (1979) and is related to the resampling method proposed by Hjort (1985) for the Cox model. The resampled quantities are a form of generalized residuals which have a distribution that is independent of the explanatory variables. Hence, unlike some methods, this approach does not require resampling of explanatory variables, which would be contrary to the assumption that they are nonrandom. An invariance property of the Cox likelihood allows these residuals to be transformed into a convenient scale for generating a likelihood. Also, the method can incorporate many forms of censoring. A simulation study of the proposed procedure shows that it can be used to improve upon the usual estimation procedures for regression parameters in the Cox model.
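
A minimal sketch of the residual-resampling idea for the Cox model, offered as an illustration in the spirit of this plan rather than the paper's exact algorithm: Cox-Snell-type generalized residuals behave approximately like a unit-exponential sample whose distribution does not involve the fixed covariates, so they can be resampled and mapped back to the time scale through the inverse of the estimated cumulative hazard. The column names, the single covariate 'z', and the crude handling of censoring are assumptions of the sketch.

```python
# Sketch: residual bootstrap for a Cox model with a fixed covariate 'z'.
import numpy as np
from lifelines import CoxPHFitter

def residual_bootstrap_betas(df, n_boot=200, seed=0):
    """Resample Cox-Snell-type residuals and refit; 'df' is a pandas
    DataFrame with columns 'time', 'event', 'z' only. Censoring is
    carried over crudely here, unlike a careful implementation."""
    rng = np.random.default_rng(seed)
    cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
    H0 = cph.baseline_cumulative_hazard_.iloc[:, 0]            # H0(t) step function
    risk = np.exp(cph.params_["z"] * df["z"].to_numpy())
    resid = np.interp(df["time"], H0.index, H0.values) * risk  # generalized residuals
    betas = []
    for _ in range(n_boot):
        r_star = rng.choice(resid, size=len(resid), replace=True)
        # Crude linear inversion of H0 sends residuals back to the time scale.
        t_star = np.interp(r_star / risk, H0.values, H0.index)
        d = df.assign(time=np.maximum(t_star, 1e-8))
        betas.append(CoxPHFitter()
                     .fit(d, duration_col="time", event_col="event")
                     .params_["z"])
    return np.array(betas)
```

The spread of the returned coefficients approximates the sampling variability of the Cox estimate under the fitted model; the plan described in the abstract additionally exploits the invariance property of the Cox likelihood and handles censoring more carefully.
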
Article
T. M. Therneau, P. M. Grambsch and T. R. Fleming [Biometrika 77, No. 1, 147-160 (1990)] used martingale residual plots to study the threshold effect of some covariates in a proportional hazard regression model for survival data subject to right censoring. We show that the maximum partial likelihood estimate provides an asymptotically consistent estimator for the unknown threshold. This procedure is illustrated by applying it to a data set from a cohort of patients with B-lineage leukemia treated at St. Jude Children’s Research Hospital.
Article
Let $\mathbf{N} = (N_1, \cdots, N_k)$ be a multivariate counting process and let $\mathscr{F}_t$ be the collection of all events observed on the time interval $\lbrack 0, t\rbrack.$ The intensity process is given by $\Lambda_i(t) = \lim_{h \downarrow 0} \frac{1}{h}E(N_i(t + h) - N_i(t) \mid \mathscr{F}_t)\quad i = 1, \cdots, k.$ We give an application of the recently developed martingale-based approach to the study of $\mathbf{N}$ via $\mathbf{\Lambda}.$ A statistical model is defined by letting $\Lambda_i(t) = \alpha_i(t)Y_i(t), i = 1, \cdots, k,$ where $\mathbf{\alpha} = (\alpha_1, \cdots, \alpha_k)$ is an unknown nonnegative function while $\mathbf{Y} = (Y_1, \cdots, Y_k),$ together with $\mathbf{N},$ is a process observable over a certain time interval. Special cases are time-continuous Markov chains on finite state spaces, birth and death processes and models for survival analysis with censored data. The model is termed nonparametric when $\mathbf{\alpha}$ is allowed to vary arbitrarily except for regularity conditions. The existence of complete and sufficient statistics for this model is studied. An empirical process estimating $\beta_i(t) = \int^t_0 \alpha_i(s) ds$ is given and studied by means of the theory of stochastic integrals. This empirical process is intended for plotting purposes and it generalizes the empirical cumulative hazard rate from survival analysis and is related to the product limit estimator. Consistency and weak convergence results are given. Tests for comparison of two counting processes, generalizing the two sample rank tests, are defined and studied. Finally, an application to a set of biological data is given.
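
The empirical process referred to is what is now called the Nelson-Aalen estimator; in this notation it takes the form

$$ \hat{\beta}_i(t) \;=\; \int_0^t \frac{J_i(s)}{Y_i(s)}\, dN_i(s) \;=\; \sum_{j:\, T_{ij} \le t} \frac{1}{Y_i(T_{ij})}, $$

where the $T_{ij}$ are the jump times of $N_i$ and $J_i(s) = I\{Y_i(s) > 0\}$ guards against division by zero; in the survival setting this is the cumulative hazard estimate underlying hazard plotting.
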
Article
We consider modelling interaction between a categoric covariate T and a continuous covariate Z in a regression model. Here T represents the two treatment arms in a parallel-group clinical trial and Z is a prognostic factor which may influence response to treatment (known as a predictive factor). Generalization to more than two treatments is straightforward. The usual approach to analysis is to categorize Z into groups according to cutpoint(s) and to analyse the interaction in a model with main effects and multiplicative terms. The cutpoint approach raises several well-known and difficult issues for the analyst. We propose an alternative approach based on fractional polynomial (FP) modelling of Z in all patients and at each level of T. Other prognostic variables can also be incorporated by first constructing a multivariable adjustment model which may contain binary covariates and FP transformations of continuous covariates other than Z. The main step involves FP modelling of Z and testing equality of regression coefficients between treatment groups in an interaction model adjusted for other covariates. Extensive experience suggests that a two-term fractional polynomial (FP2) function may describe the effect of a prognostic factor on a survival outcome quite well. In a controlled trial, this FP2 function describes the prognostic effect averaged over the treatment groups. We refit this function in each treatment group to see if there are substantial differences between groups. Allowing different parameter values for the chosen FP2 function is flexible enough to detect such differences. Within the same algorithm we can also deal with the conceptually different cases of a predefined hypothesis of interaction or searching for interactions. We demonstrate the ability of the approach to detect and display treatment/covariate interactions in two examples from controlled trials in cancer.
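
For reference, a two-term fractional polynomial in $z > 0$ has the form

$$ \mathrm{FP2}: \quad \beta_1 z^{p_1} + \beta_2 z^{p_2}, \qquad p_1, p_2 \in \{-2, -1, -0.5, 0, 0.5, 1, 2, 3\}, $$

with the conventions $z^{0} \equiv \log z$ and, for equal powers $p_1 = p_2 = p$, the second term taken as $\beta_2 z^{p} \log z$. The interaction test then compares a model with common FP2 coefficients across treatment arms against one with arm-specific coefficients.
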
Article
Current staging and risk-stratification methods in oncology, while helpful, fail to adequately predict malignancy aggressiveness and/or response to specific treatment. Increased knowledge of cancer biology is generating promising marker candidates for more accurate diagnosis, prognosis assessment, and therapeutic targeting. To translate these results into maximum patient benefit, the utility of markers should be assessed through a disciplined application of well-designed clinical trials. In this article, we first review the major issues to consider when designing a clinical trial assessing the usefulness of a predictive marker. We then present two classes of clinical trial designs: the Marker by Treatment Interaction Design and the Marker-Based Strategy Design. In the first design, we assume that the marker splits the population into groups in which the efficacy of a particular treatment will differ. This design can be viewed as a classical randomized clinical trial with upfront stratification for the marker. In the second design, after the marker status is known, each patient is randomly assigned either to have therapy determined by their marker status or to receive therapy independent of marker status. The predictive value of the marker is assessed by comparing the outcome of all patients in the marker-based arm to that of all of the patients in the non-marker-based arm. We present detailed sample size calculations for a specific clinical scenario. We discuss the advantages and disadvantages of the two trial designs and their appropriateness to specific clinical situations to assist investigators seeking to design rigorous, marker-based clinical trials.
Article
In medical research, continuous variables are often converted into categorical variables by grouping values into two or more categories. We consider in detail issues pertaining to creating just two groups, a common approach in clinical research. We argue that the simplicity achieved is gained at a cost; dichotomization may create rather than avoid problems, notably a considerable loss of power and residual confounding. In addition, the use of a data-derived 'optimal' cutpoint leads to serious bias. We illustrate the impact of dichotomization of continuous predictor variables using as a detailed case study a randomized trial in primary biliary cirrhosis. Dichotomization of continuous data is unnecessary for statistical analysis and in particular should not be applied to explanatory variables in regression models.