Journal of Biopharmaceutical Statistics, 16: 429–441, 2006
Copyright © Taylor & Francis Group, LLC
ISSN: 1054-3406 print/1520-5711 online
DOI: 10.1080/10543400600719251
STATISTICAL CONSIDERATIONS
FOR NONINFERIORITY/EQUIVALENCE TRIALS
IN VACCINE DEVELOPMENT
W. W. B. Wang, D. V. Mehrotra, I. S. F. Chan, and J. F. Heyse
Clinical Biostatistics, Merck Research Laboratories, North Wales,
Pennsylvania, USA
Noninferiority/equivalence designs are often used in vaccine clinical trials. The goal of
these designs is to demonstrate that a new vaccine, or new formulation or regimen of
an existing vaccine, is similar in terms of effectiveness to the existing vaccine, while
offering such advantages as easier manufacturing, easier administration, lower cost, or
improved safety profile. These noninferiority/equivalence designs are particularly useful
in four common types of immunogenicity trials: vaccine bridging trials, combination
vaccine trials, vaccine concomitant use trials, and vaccine consistency lot trials. In this
paper, we give an overview of the key statistical issues and recent developments for
noninferiority/equivalence vaccine trials. Specifically, we cover the following topics: (i)
selection of study endpoints; (ii) formulation of the null and alternative hypotheses;
(iii) determination of the noninferiority/equivalence margin; (iv) selection of efficient
statistical methods for the statistical analysis of noninferiority/equivalence vaccine
trials, with particular emphasis on adjustment for stratification factors and missing pre-
or post-vaccination data; and (v) the calculation of sample size and power.
Key Words: Equivalence; Minimum risk weights; Missing data; Noninferiority; Stratified analysis;
Vaccine clinical trial.
1. INTRODUCTION
The usual goal of vaccination is to simulate an in vivo pathogen-specific
exposure that triggers the host’s immune system to generate a pool of effector
and memory B or T cells that will protect against potential real exposures in the
future. The simulation is accomplished via inoculation of the host by a vaccine that
contains either a live, attenuated version of the pathogen, or a DNA plasmid or
viral vector that encodes relevant genes of the pathogen to help elicit a cell-mediated
immune response, and so on.
The immunogenicity of a new vaccine is studied in the early stages of
development to assess whether the vaccine can induce quantifiable levels of
pathogen-specific immune responses. In later stages, the goal is to assess whether
the immune marker used to quantify vaccine immunogenicity qualifies as a correlate
Received and Accepted March 13, 2006
Address correspondence to William W.B. Wang, Clinical Biostatistics, Merck Research
Laboratories, UG-1CD, P.O. Box 1000, North Wales, Pennsylvania 19454 USA; E-mail:
William_Wang@Merck.Com
of (or surrogate for) disease protection. Once a correlate for disease protection has
been established, immunogenicity trials with noninferiority/equivalence designs are
widely used as economic and time-efficient alternatives to large efficacy trials in
evaluating the effectiveness of a new or reformulated vaccine as compared to already
licensed vaccines (Chan et al., 2003; Horne, 1995; Mehrotra, in press).
Noninferiority/equivalence designs are particularly useful in four common
types of vaccine immunogenicity trials: vaccine bridging trials, combination vaccine
trials, vaccine concomitant use trials, and vaccine consistency lot trials. Each of
these trials serves a different purpose. For example, vaccine bridging trials are
used because manufacturing processes or storage conditions may be changed after
vaccine licensure to improve production yields and vaccine stability/shelf life.
Vaccine bridging trials are often required to demonstrate that such changes do
not have an adverse impact on vaccine effectiveness, by ruling out a clinically
significant difference in immune responses between the modified vaccine and the
current vaccine.
Combination vaccine trials are typically used to rule out clinically significant
differences in immune responses between a combined vaccine and separate but
simultaneously administered vaccines (Blackwelder, 1995; FDA, 1997; Horne et al.,
2001). A combination vaccine is intended to prevent multiple diseases or to prevent
one disease caused by different strains or serotypes of the same organism while
reducing the number of injections required (Chan et al., 2003). Similarly, since
the concomitant administration of multiple vaccines can reduce the number of
vaccination visits, vaccine concomitant use trials are used to rule out clinically
significant differences in immune responses between the concomitant administration
of two or more vaccines and the separate administration of each vaccine.
Finally, since vaccines are biological products that are not as stable and
well characterized as chemical drug products, vaccine consistency lot trials are
required to study multiple (typically, three) lots of vaccines made from the same
manufacturing process (called consistency lots). The goal is to rule out a clinically
significant difference in immunogenicity in either direction between any two pairs
of consistency lots.
The goal of this article is to provide an overview of key statistical
issues and recent developments for noninferiority/equivalence vaccine trials.
Some of the issues and methodologies that we discuss are also applicable to
drug noninferiority/equivalence trials. However, there are noteworthy differences
between drug and vaccine trials; where appropriate, we will draw attention to
such differences. The rest of the paper is organized as follows. In Section 2, we
describe typical immunogenicity endpoints and noninferiority/equivalence margins
used in practice, and discuss the formulation of the null and alternative hypotheses.
In Section 3, we discuss statistical analysis methods, including novel approaches
for dealing with stratification and missing data either at pre- or post-vaccination.
We discuss sample size and power considerations in Section 4, and offer some
concluding remarks in Section 5.
2. ENDPOINTS, HYPOTHESES, AND MARGIN SELECTION
The selection of immunogenicity endpoints, the formulation of study
hypotheses, and the choice of the noninferiority/equivalence margin should be
based on sound clinical, regulatory, and statistical judgments. A commonly used
immunologic endpoint is the immune response rate, defined as the percentage of
subjects who achieve a certain level of immune response after vaccination (yes/no).
If a particular level of immune response (either based on an absolute cutoff or a
fold rise from baseline) has been shown to be correlated with disease protection,
the percentage of participants achieving this “protective level” after vaccination is
usually considered the primary endpoint for immunogenicity analyses. However,
this definition is susceptible to potential issues in assay stability and consistency
across subjects, study sites, and assay-performing laboratories. Accordingly, another
commonly used immunologic endpoint is the geometric mean concentration (GMC)
of immune response after vaccination or the geometric mean fold rise (GMFR)
from pre-vaccination to post-vaccination. In the absence of a reasonable protective
level or a good correlate of protection, the GMC or GMFR is used as the primary
immunogenicity endpoint. In addition, as noted by Plikaytis and Carlone (2004),
the selection of endpoints can depend on the type of vaccine, such as T-cell
independent versus T-cell dependent, and the kinetic curve of immune responses
after vaccination.
The primary objective of a vaccine noninferiority/equivalence trial is to
demonstrate that a new or modified vaccine is noninferior or equivalent to the
current vaccine by ruling out a prespecified clinically relevant difference in the
immune response. Accordingly, in a noninferiority immunogenicity trial without
stratification, the null and alternative hypotheses are generally set up as
H0: δ ≤ −δ0  versus  H1: δ > −δ0   (1)

In this hypothesis setup, depending on the endpoint being used, δ = PT − PC or δ = log(GMCT) − log(GMCC), where PT and PC are the population immune response rates, and GMCT and GMCC are the population geometric means of the immune response (either in terms of concentration or in terms of fold rise) after vaccination, for the new vaccine and the control vaccine groups, respectively. δ0 is a prespecified small positive quantity defining the noninferiority margin. The nominal significance level for this one-sided test (usually one-sided α = 0.025) is generally set at half of the conventional significance level for a two-sided test for a difference in proportions (Schuirmann, 1987). This approach has been adopted in regulatory environments, as suggested in the International Conference on Harmonization E9 (ICH E9) guidelines (1999).
In vaccine equivalence trials, such as consistency-lot studies, the objective is to
show that the two vaccines are similar by ruling out a clinically significant difference
in either direction. Such studies are typically designed as a two-sided (at the α = 0.05 level) equivalence trial with respect to both immune response rates and GMCs. For such trials, the hypothesis in Eq. (1) is replaced with H0: δ ≤ −δ0 or δ ≥ δ0 versus H1: −δ0 < δ < δ0, and the approach of two one-sided tests (Schuirmann, 1987) is recommended by the ICH E9 guidelines (1999).
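As a sketch of the two one-sided tests procedure on the log scale, the following Python function checks both one-sided z-tests and the equivalent (1 − 2α) confidence-interval criterion. All numeric inputs in the example are hypothetical, and the standard error is supplied directly rather than estimated from trial data:

```python
import math
from statistics import NormalDist

def tost_log_gmc(diff_log_means, se, delta0, alpha=0.05):
    """Two one-sided tests (TOST) for equivalence of log-GMCs.

    H0: delta <= -delta0 or delta >= delta0 is rejected (equivalence is
    concluded) only if BOTH one-sided z-tests reject at level alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    z_lower = (diff_log_means + delta0) / se   # tests H0: delta <= -delta0
    z_upper = (delta0 - diff_log_means) / se   # tests H0: delta >= delta0
    equivalent = (z_lower >= z_crit) and (z_upper >= z_crit)
    # Equivalent criterion: the two-sided (1 - 2*alpha) CI for delta
    # lies entirely inside (-delta0, delta0).
    half_width = z_crit * se
    ci = (diff_log_means - half_width, diff_log_means + half_width)
    return equivalent, ci

# Hypothetical example: observed GMC ratio 1.05, a 1.5-fold margin
# (delta0 = log 1.5), and SE of the log-difference equal to 0.08.
eq, ci = tost_log_gmc(math.log(1.05), 0.08, math.log(1.5), alpha=0.05)
```

With these inputs both one-sided tests reject, so equivalence would be concluded; inflating the standard error to 0.30 makes the lower test fail.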
With respect to the noninferiority margin, the choice of δ0 should justify that
the new vaccine preserves a large proportion of the effectiveness of the control
vaccine relative to a placebo. Although a noninferiority margin that preserves 50%
of the treatment effect has been proposed for the evaluation of drug treatments
(Ebbutt and Frith, 1999; Temple, 1996), there is a general perception that a
narrower margin should be used in preventive vaccine trials because vaccines
are given to potentially millions of healthy individuals for prophylaxis. The
noninferiority or equivalence margin for vaccine immunogenicity should depend
on the level of the correlation between immune responses and the vaccine efficacy,
the variability of immunogenicity responses, the class of vaccine being tested, and
the relative importance of the immunogenicity endpoints in the given trial. For
example, in studies of the hepatitis A vaccine VAQTA®, the immune response rates in terms of seroconversion in the VAQTA® vaccine groups are usually greater than 90% (with 0% in the placebo group) and are highly correlated with protection. A margin that preserves only half of that response rate would clearly be inadequate; instead, a δ0 of 10 percentage points has been used as the noninferiority margin (Frey et al., 1999). For comparing GMCs, a δ0 of log(0.67) or log(0.5) (corresponding to a 1.5- or 2-fold difference) has been used. The noninferiority margin should be discussed proactively
between the trial sponsor and the regulatory agencies. Temple (1996), Ebbutt and
Frith (1999), and both the ICH E9 (1999) and ICH E10 guidelines (2000) and the
EMEA guideline (2005) include some more general discussions on the choice of
noninferiority and equivalence margins.
3. STATISTICAL METHODS FOR VACCINE
NONINFERIORITY/EQUIVALENCE TRIALS
The analysis of noninferiority/equivalence trials is generally based on the dual use of hypothesis testing and test-based confidence intervals; both p-values and confidence intervals are reported in most trials. Operationally, rejection of the null hypothesis in Eq. (1) at the one-sided α level is equivalent to the lower bound of the two-sided (1 − 2α) confidence interval for δ being greater than −δ0.
Asymptotic statistical tests of hypothesis in Eq. (1) and corresponding test-
based confidence intervals for two treatment groups with a dichotomous endpoint
have been extensively discussed in the literature in assessing a vaccine effect based
on the difference of two immune response rates. Many authors have proposed
Z-type test statistics with different standard error estimates (Blackwelder, 1982;
Miettinen and Nurminen, 1985). The more commonly used and better performing of these is the Z-type statistic, shown below, proposed by Miettinen and Nurminen (1985), Farrington and Manning (1990), and Chan and Zhang (1999):

ZD = (P̂T − P̂C + δ0) / σ̃0   (2)

where

σ̃0 = [(1/NT) P̃T(1 − P̃T) + (1/NC) P̃C(1 − P̃C)]^{1/2}   (3)

and P̃T, P̃C are the constrained maximum likelihood estimates (MLEs) of (PT, PC) under the null hypothesis given in Eq. (1), based on the observed responses P̂T, P̂C. (Detailed expressions for P̃T, P̃C are presented by Miettinen and Nurminen (1985), as well as by Farrington and Manning (1990).) Noninferiority is established
(H0 is rejected) at the one-sided significance level α if ZD ≥ Zα, where Zα is the 100(1 − α) percentile of the standard normal distribution.
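The constrained MLEs have a closed-form (cubic) solution given by Miettinen and Nurminen (1985) and Farrington and Manning (1990); as an illustrative sketch only, the Python code below instead finds them by a fine grid search over the constraint pT = pC − δ0 and then forms ZD as in Eqs. (2)–(3). The data in the usage example are hypothetical:

```python
import math

def constrained_mle(xT, nT, xC, nC, delta0, grid=20000):
    """Grid-search approximation to the constrained MLEs (pT~, pC~) that
    maximize the binomial log-likelihood subject to pT - pC = -delta0
    (the boundary of the null in Eq. (1)). A stand-in for the
    closed-form cubic solution."""
    best, best_ll = None, -math.inf
    lo, hi = max(delta0, 0.0), min(1.0, 1.0 + delta0)
    for k in range(1, grid):
        pC = lo + (hi - lo) * k / grid
        pT = pC - delta0
        if not (0.0 < pT < 1.0 and 0.0 < pC < 1.0):
            continue
        ll = (xT * math.log(pT) + (nT - xT) * math.log(1 - pT)
              + xC * math.log(pC) + (nC - xC) * math.log(1 - pC))
        if ll > best_ll:
            best, best_ll = (pT, pC), ll
    return best

def z_noninferiority(xT, nT, xC, nC, delta0):
    """Z_D of Eq. (2) with the null-restricted variance of Eq. (3)."""
    pT_hat, pC_hat = xT / nT, xC / nC
    pT_t, pC_t = constrained_mle(xT, nT, xC, nC, delta0)
    var0 = pT_t * (1 - pT_t) / nT + pC_t * (1 - pC_t) / nC
    return (pT_hat - pC_hat + delta0) / math.sqrt(var0)
```

For example, with 92/100 responders in each group and δ0 = 0.10, ZD is about 2.37, which exceeds Z0.025 = 1.96, so noninferiority would be concluded at the one-sided 0.025 level.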
The method in the preceding paragraph works well with large
sample sizes. With “small” sample sizes (say n<50 per group), exact (rather than
asymptotic theory-based) tests and corresponding confidence intervals (Agresti and
Min, 2001; Chan, 1998, 2002, 2003; Chan and Zhang, 1999; Mehrotra et al., 2003)
are preferable. However, as noted by Mehrotra et al. (2003), the test statistic for
exact inference should be carefully chosen to avoid a loss in power. For example,
the only option in the StatXact-4 software package for calculating exact confidence
intervals used the numerator of ZDas the test statistic; the power based on this
unstandardized statistic can be substantially lower than the power based on ZD.
Fortunately, StatXact-5 includes other options for calculating exact unconditional
test-based confidence intervals, including Chan and Zhang’s (1999) method based
on inverting two one-sided tests using the ZDstatistic, and the Agresti and Min’s
(2001) method of inverting one two-sided test. It should be noted that with relatively
large sample sizes (say, n>200 per group), the exact and corresponding asymptotic
methods usually yield very similar results, so use of the former is not necessary.
Of note, since noninferiority trials focus on one-sided (α level) hypothesis tests, we recommend that the corresponding exact confidence intervals (usually at the two-sided 1 − 2α level) be obtained by inverting two one-sided tests to ensure consistency of inference, in the sense that rejection of the noninferiority null hypothesis in Eq. (1) (p-value ≤ α) is equivalent to the lower bound of the two-sided (1 − 2α) CI on the difference being greater than −δ0 (Chan, 2003; Chan and Mehrotra, 2003). The confidence interval obtained by inverting one two-sided test (Agresti and Min, 2001), although generally narrower than the confidence interval obtained by inverting two one-sided tests, does not guarantee control of the error rate on each side at the α level; hence, it may produce results that are inconsistent with the one-sided noninferiority hypothesis test (Chan, 2003).
In assessing a vaccine effect based on a fold difference in GMCs between
the vaccine and control groups, the statistical testing for hypothesis in Eq. (1)
and the corresponding confidence interval estimation are often performed by using
an analysis of variance (ANOVA), an analysis of covariance (ANCOVA), or a
linear mixed-effects model that includes the natural log of immune responses as the
dependent variable, and the treatment group, baseline, and stratification factors (if
any) as the explanatory variables.
Statistical Methods for Stratified Trials
If it is known a priori that certain prognostic factors, such as the subject’s age,
gender, or pre-vaccination immune status, will influence the response to vaccination,
then the strategy of pre-stratification by such factors is often used in clinical trials
to facilitate unbiased and more efficient comparisons of treatment groups. For a
stratified trial, as noted by Mehrotra (2002), it is important to be clear about what
hypothesis is being tested; see also Gail et al. (1996) and Ganju and Mehrotra
(2003). For a noninferiority vaccine trial, the hypothesis that is usually of interest is
H0: δ = Σᵢ fᵢδᵢ ≤ −δ0  versus  H1: δ = Σᵢ fᵢδᵢ > −δ0   (4)

where the sums run over the s strata, δᵢ is the true difference in response rates (or means) for stratum i, and fᵢ is the fraction of subjects in the target population that belong to stratum i (Σᵢ fᵢ = 1).
A test of the hypothesis in Eq. (4) is typically conducted using the following statistic:

Zw = (δ̂w + δ0) / [V0(δ̂w)]^{1/2}   (5)

In Eq. (5), δ̂w = Σᵢ wᵢδ̂ᵢ = Σᵢ wᵢ(P̂iT − P̂iC), and wᵢ is the weight assigned to the ith stratum (Σᵢ wᵢ = 1). There are two options for the denominator in Eq. (5). The first option is to use the null variance V0(δ̂w) = Σᵢ wᵢ² V0(δ̂ᵢ), where V0(δ̂ᵢ) = (1/NiT) P̃iT(1 − P̃iT) + (1/NiC) P̃iC(1 − P̃iC), with P̃iT, P̃iC as before. The second option is to replace V0(δ̂ᵢ) with the observed variance (Blackwelder, 1995; Dunnett and Gent, 1977), p̂iC(1 − p̂iC)/niC + p̂iT(1 − p̂iT)/niT. In either case, the null hypothesis is rejected at the one-sided α level if Zw > Zα.
For the weights, wᵢ, two popular choices are the Cochran-Mantel-Haenszel (CMH) weights and the precision or inverse-variance (INVAR) weights (Mehrotra and Railkar, 2000). The CMH weights are proportional to the harmonic means of the observed stratum-specific sample sizes, while the INVAR weights are proportional to the reciprocals of the observed variances of the stratum-specific differences. The CMH weights are optimal (in terms of power) if the true odds ratios, pᵢT(1 − pᵢC)/[pᵢC(1 − pᵢT)], are constant across strata, while the INVAR weights are optimal if the δᵢ are constant. Unfortunately, in practice, we rarely know whether it is the true stratum-specific odds ratios or the δᵢ that are “closer” to being constant across strata, so the choice between CMH and INVAR is essentially a gamble. To help minimize the potential loss in power that might be incurred by gambling unfavorably, Mehrotra and Railkar (2000) proposed a “minimum risk” (MR) weighting strategy, which yields an estimate of δ that has the smallest asymptotic mean squared error, and offers power advantages as well.
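A minimal sketch of the stratified statistic in Eq. (5) with the observed-variance denominator, supporting CMH and INVAR weights (the MR weights, whose exact form is given in Mehrotra and Railkar (2000), are omitted here). The stratum counts in the example are hypothetical:

```python
import math

def stratified_z(strata, delta0, weights="cmh"):
    """Stratified noninferiority Z of Eq. (5), observed-variance version.
    Each stratum is a tuple (xT, nT, xC, nC). CMH weights are
    proportional to harmonic means of the per-stratum sample sizes;
    INVAR weights to reciprocals of the observed variances."""
    if weights not in ("cmh", "invar"):
        raise ValueError("weights must be 'cmh' or 'invar'")
    diffs, variances, w = [], [], []
    for xT, nT, xC, nC in strata:
        pT, pC = xT / nT, xC / nC
        diffs.append(pT - pC)
        variances.append(pT * (1 - pT) / nT + pC * (1 - pC) / nC)
        if weights == "cmh":
            w.append(nT * nC / (nT + nC))   # harmonic-mean weight
    if weights == "invar":
        w = [1.0 / v for v in variances]
    total = sum(w)
    w = [wi / total for wi in w]            # normalize so sum(w) = 1
    d_w = sum(wi * di for wi, di in zip(w, diffs))
    v_w = sum(wi ** 2 * vi for wi, vi in zip(w, variances))
    return (d_w + delta0) / math.sqrt(v_w)
```

For two hypothetical strata, `stratified_z([(45, 50, 46, 50), (88, 100, 90, 100)], 0.10)` gives a Z value above 1.96 under either weighting.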
The MR weighting strategy can be particularly useful for stratified
noninferiority vaccine trials. To illustrate, Table 1 contains simulation results
comparing the null variance (MN) method with CMH weights, and the observed
variance method with CMH or MR weights for a stratified analysis of a difference
in two proportions. In more general settings, it has been shown that using the null
variance is typically better than using the observed variance in terms of power,
Type I error rate control, and confidence interval coverage (Dunnett and Gent,
1977; Miettinen and Nurminen, 1985). However, for stratified noninferiority trials
involving vaccines with high response rates (>90%), the results in Table 1 suggest
that using the observed variance along with MR rather than CMH weights can yield
notable gains in power. Intuitively, this is caused by the relatively large separation
of P̃iT and P̃iC under the constraint P̃iT − P̃iC = −δ0, which in turn leads to a
relatively large estimated null variance as compared with the observed variance. Of
note, the same result holds for unstratified trials, or for stratified trials in which
stratification is ignored at the time of analysis and an unstratified analysis is used;
the latter is not recommended because it can result in a loss in power (Mehrotra
and Railkar, 2000; Mehrotra, 2001).
Table 1 Comparison of using null variances with CMH weights versus using observed variances with either CMH or MR weights

      Stratum 1      Stratum 2                           Methods
   p1T    p1C     p2T    p2C      δ      δ0   N/trt   CMH_OBS  CMH_MN  MR_OBS

Empirical Type I error rates (%) (nominal = 2.5%, 1-sided)
   .35    .50     .50    .65   −0.15   0.15    130      2.5     2.7     2.4
   .60    .70     .80    .90   −0.10   0.10    200      2.1     2.3     2.4
   .85    .90     .90    .95   −0.05   0.05    350      2.2     2.2     2.3

Empirical power (%)
   .50    .50     .80    .80    0      0.15    130     79.1    80.1    80.2
   .70    .70     .90    .90    0      0.10    200     79.0    78.8    83.3
   .90    .90     .95    .95    0      0.05    350     75.2    74.0    77.9

(f1 = 0.3, f2 = 0.7); 10,000 simulations;
CMH_OBS = CMH weights with observed variance;
CMH_MN = CMH weights with null variance (MN method);
MR_OBS = minimum risk weights with observed variance.
Another important issue in stratified data analysis for noninferiority trials is
the test of treatment by stratum interaction. Gail and Simon (1985) used the terms
quantitative and qualitative interactions and proposed likelihood-based testing
procedures. Pan and Wolfe (1997) generalized their methodology to interactions
with clinical significance, that is, clinically meaningful interaction. For quantitative
interaction with binary responses, Mehrotra (2002) proposed a new test which, in a comprehensive simulation study by Mehrotra and Chan (2000), controlled the Type I error rate and was generally at least as powerful as several other published
methods. Wiens and Heyse (2003) presented and compared five different analysis
strategies to test for qualitative treatment-by-stratum interaction in noninferiority
trials.
Statistical Methods in Handling Missing Immunogenicity Data
Most vaccine regimens include a sequence of one or more “priming”
inoculations followed by a “booster” shot later, if necessary. Blood samples
are collected at one or more time points after each inoculation and assayed
for immune activity. Missing data and losses to follow-up do occur in vaccine
noninferiority/equivalence (and superiority) trials. This situation is similar to the
incomplete longitudinal data problem for drug trials. However, there are two key
differences. First, while the missing data resulting from dropouts in vaccine trials
are typically missing completely at random (MCAR), they are more likely to be
either missing at random (MAR) or non-ignorably missing (NM) for drug trials.
The reason is that patients often drop out of drug trials because they are not responding favorably to their assigned treatment (e.g., high blood pressure that is not declining); this mechanism is generally not applicable to vaccine trials. The second
key difference is that the ability to predict or impute the missing data at, say,
the post-boost visit of interest may be better for vaccine trials compared with
drug trials. This happens because subjects in vaccine trials are inherently less
heterogeneous than patients in drug trials. Moreover, for some (but not all) vaccines,
the post-prime responses are highly correlated with (and hence predictive of) the
post-boost responses.
A common source of missing data in vaccine immunogenicity trials is
mishaps or errors in blood sample storage, sample handling, or assay testing for
immunogenicity measurements. In such trials, missing data can occur at baseline
(pre-vaccination) as well as at post-vaccination. This type of missing data can complicate
the data analysis if the baseline immune status is to be used as a covariate, as, for
example, in an ANCOVA model for continuous responses with the baseline value
as the covariate. A simple and common way to tackle this missing immunogenicity
data problem is to use a “complete case” (CC) analysis which excludes subjects with
missing pre- or post-vaccination data. Although this approach is unbiased under
MCAR, it is inefficient because it fails to utilize the information of the excluded
subjects. A better alternative is to use principled methods for longitudinal data
analysis such as “restricted maximum likelihood” (REML), “generalized estimating
equations” (GEE), or “multiple imputation” (MI), all of which are readily available
in standard software. The gains in efficiency of the latter approaches over the
complete case analysis can be significant when the amount of missing data is
large. For example, there could be a substantial amount of missing post-boost
immunogenicity data in the case of an interim analysis of an ongoing trial when the
later enrolled subjects have received priming inoculations but not yet been boosted.
If the post-prime information from these subjects is not included in the interim
analysis, a large amount of available information is ignored. As a result, it is preferable to use an analytical approach that incorporates all of the data.
Li et al. (in press) proposed a propensity score-based multiple imputation (MI)
method to tackle missing data in longitudinal clinical trials with binary responses,
with particular emphasis on vaccine immunogenicity trials. They noted three key
results. First, if data are missing completely at random, MI can be notably more
efficient than the CC and GEE methods. Second, with small samples, GEE often
fails because of “convergence problems,” but MI is free of those problems. Finally,
if the data are missing at random, MI generally yields results with negligible bias,
while the CC and GEE methods yield results with moderate to large bias.
Wang et al. (2003) applied a longitudinal regression approach with a
“restricted” linear model structure (Liang and Zeger, 2000) to analyze vaccine
immunogenicity data; the restriction is that the population means for all treatment
groups are identical at baseline. This model allows for the comparison of
postvaccination immune responses between two vaccination groups, adjusting for
pre-vaccination immunogenicity levels in the presence of incomplete data (missing
either at pre- or post-vaccination). Under the assumption of equal population
means at baseline, which is justified in randomized trials, this longitudinal model
provides an unbiased estimate of the treatment effect while increasing the power for
the noninferiority/equivalence comparison as compared with the “complete case”
analysis. In the case of no missing data, estimates of treatment differences from this
longitudinal model are identical to those from the traditional analysis of covariance
(ANCOVA) model.
4. POWER CALCULATION FOR VACCINE
NONINFERIORITY/EQUIVALENCE TRIALS
There are many methods for calculating sample size and power for vaccine
noninferiority/equivalence trials. Here, we briefly describe the methods based on the
commonly used asymptotic Z-test for noninferiority immunogenicity trials.
For bridging studies that are designed to test the noninferiority hypothesis
in Eq. (1) with respect to immune response rates, a commonly used method is the
sample size formula proposed by Farrington and Manning (1990), which is based
on the Z-type test statistic in Eq. (2). To have 1 − β power to claim noninferiority at the one-sided α level, the approximate sample size for testing the hypothesis in Eq. (1) is

NT = (Zα σ̄0 + Zβ σ1)² / (PT − PC + δ0)²  and  NC = uNT   (6)

where u is the prespecified sample size ratio between the control and the test vaccine groups, PC and PT are the expected immune response rates for the control and test vaccine groups in the planned study, respectively, σ1² = PT(1 − PT) + PC(1 − PC)/u (i.e., NT times the variance in Eq. (3) with the constrained maximum likelihood estimates replaced by PT and PC), and σ̄0² = P̃T(1 − P̃T) + P̃C(1 − P̃C)/u is the corresponding value of σ̃0² obtained when (P̃T, P̃C) are calculated taking (P̂T, P̂C) = (PT, PC). Similarly, the power of a one-sided α level ZD test based on the normal approximation is given by

1 − β = Φ( (1/σ1)[ −Zα σ̄0 + √NT (PT − PC + δ0) ] )   (7)

where Φ is the standard normal cumulative distribution function.
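Eq. (6) can be sketched in a few lines of Python. As in the earlier sketch, the constrained rates are obtained by a grid search rather than the closed-form solution of Farrington and Manning (1990), so the result should be treated as an approximation; the inputs in the example are hypothetical:

```python
import math
from statistics import NormalDist

def fm_sample_size(pT, pC, delta0, alpha=0.025, power=0.9, u=1.0):
    """Approximate sample size for the noninferiority hypothesis in
    Eq. (1), following Eq. (6); NC = u * NT. The null-restricted rates
    are found numerically on the boundary pT~ = pC~ - delta0."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    best, best_ll = None, -math.inf
    lo, hi = max(delta0, 0.0), min(1.0, 1.0 + delta0)
    for k in range(1, 20000):
        qC = lo + (hi - lo) * k / 20000
        qT = qC - delta0
        if not (0.0 < qT < 1.0):
            continue
        # expected log-likelihood, weighting the control arm by u
        ll = (pT * math.log(qT) + (1 - pT) * math.log(1 - qT)
              + u * (pC * math.log(qC) + (1 - pC) * math.log(1 - qC)))
        if ll > best_ll:
            best, best_ll = (qT, qC), ll
    qT, qC = best
    var0 = qT * (1 - qT) + qC * (1 - qC) / u   # sigma_bar_0 squared
    var1 = pT * (1 - pT) + pC * (1 - pC) / u   # sigma_1 squared
    nT = (z_a * math.sqrt(var0) + z_b * math.sqrt(var1)) ** 2 \
         / (pT - pC + delta0) ** 2
    return math.ceil(nT), math.ceil(u * math.ceil(nT))
```

For example, with PT = PC = 0.90, δ0 = 0.10, one-sided α = 0.025, and 90% power, the approximation gives roughly 200 subjects per group; halving the margin roughly quadruples the sample size.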
For bridging studies that are designed to test the noninferiority hypothesis in Eq. (1) with respect to GMCs, the sample size calculation is similar to that for testing the conventional hypothesis of no difference under the log-normal assumption. Let σ be the standard deviation of the log-transformed immune responses. In order to achieve 1 − β power for testing the hypothesis in Eq. (1) at the one-sided α level, the sample size for the test vaccine group is

NT = (1 + 1/u)(Zα + Zβ)² σ² / [log(RGMC) − δ0]²  and  NC = uNT   (8)

where u is as specified earlier, δ0 is the noninferiority margin on the log scale, and RGMC = GMCT/GMCC is the expected ratio of GMCs between the new and control vaccine groups under the alternative hypothesis. The power of a one-sided α level test based on the normal approximation is given by

1 − β = Φ( −Zα + (1/σ)√(NT/(1 + 1/u)) [log(RGMC) − δ0] )   (9)
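Eq. (8) is a closed form and is straightforward to compute; a small sketch with hypothetical inputs:

```python
import math
from statistics import NormalDist

def gmc_sample_size(r_gmc, delta0, sigma, alpha=0.025, power=0.9, u=1.0):
    """Per-group sample size per Eq. (8) for a noninferiority comparison
    of GMCs on the log scale. sigma is the SD of the log-transformed
    titers; delta0 is the margin on the log scale (e.g., log(0.67) for
    a 1.5-fold margin); r_gmc is the assumed true GMC ratio."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    nT = (1 + 1 / u) * (z_a + z_b) ** 2 * sigma ** 2 \
         / (math.log(r_gmc) - delta0) ** 2
    return math.ceil(nT), math.ceil(u * math.ceil(nT))
```

For instance, assuming equal true GMCs (RGMC = 1), a 1.5-fold margin (δ0 = log 0.67), σ = 0.8, one-sided α = 0.025, and 90% power, `gmc_sample_size(1.0, math.log(0.67), 0.8)` yields 84 subjects per group.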
In order to demonstrate the stability of the vaccine manufacturing process,
consistency lot studies are often required by regulatory authorities on vaccine
products to show equivalence among three clinical lots. In such trials, there are
multiple pairwise comparisons and no closed-form formula exists for sample size
calculation, but simulations can be used to adequately plan the sample size. For
power estimation, for example, for hypothesis testing with respect to immune
response rates, Wiens et al. (1996) recommended performing simulations under the
setup that the true immune response rates for the (typically) three lots are P1, P1, and P1 + δ0/2, instead of all being equal. This setup gives a conservative estimate of power when the alternative hypothesis is true (Wiens et al., 1996).
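A simulation along these lines can be sketched as follows, assuming Wald confidence intervals for each pairwise difference and the conservative lot-rate setup above (the actual test used in a given trial may differ; all inputs are hypothetical):

```python
import math
import random
from statistics import NormalDist

def lot_consistency_power(p1, delta0, n, alpha=0.025, sims=2000, seed=1):
    """Simulated power for pairwise equivalence of three consistency
    lots, with true rates (p1, p1, p1 + delta0/2) as a conservative
    setup. Equivalence for a pair is concluded when the two-sided
    (1 - 2*alpha) Wald CI for the rate difference lies inside
    (-delta0, delta0); alpha is the level of each one-sided test."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha)
    rates = (p1, p1, p1 + delta0 / 2)
    wins = 0
    for _ in range(sims):
        # observed response rates for the three lots
        phat = [sum(rng.random() < p for _ in range(n)) / n for p in rates]
        ok = True
        for i in range(3):
            for j in range(i + 1, 3):
                d = phat[i] - phat[j]
                se = math.sqrt(phat[i] * (1 - phat[i]) / n
                               + phat[j] * (1 - phat[j]) / n)
                if not (-delta0 < d - z * se and d + z * se < delta0):
                    ok = False
        wins += ok
    return wins / sims
```

Running the simulation at increasing per-lot sample sizes shows the expected monotone gain in power; the per-comparison losses compound because all three pairwise CIs must fall inside the margin.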
In a combination or multivalent vaccine study, the hypotheses regarding each
component or each serotype are often considered to be co-primary. In planning
such a study, the sample size for each primary hypothesis can be calculated using
the formulas in Eq. (6), Eq. (8), or any other related sample size formulas for
the particular hypothesis in question. To address multiplicity concerns, a popular
approach is to plan the power for each primary hypothesis large enough such that
the overall power is acceptable under the assumption of independence among the
multiple hypotheses. For example, for a multivalent vaccine with k components, one can plan the power for the hypothesis test on each component to be at least 1 − β/k, so that the overall power is guaranteed to be no less than 1 − β. This approach controls the overall Type I error very stringently (much less than the nominal α level if each null is true), but it is rather conservative in that insignificance for just one component implies a failure to reject the overall null hypothesis. This
approach is also conservative (and provides a lower bound for power) because
the correlation between the components is usually positive. Based on extensive
simulations, Kong et al. (2004) pointed out that trial designs with high power
(>80%) under the assumption of independence have only a modest increase in
power when the correlations between the components are taken into consideration.
At this point, there is no commonly accepted testing strategy to overcome the
conservatism of this popular approach, and further research is needed.
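The power-allocation rule above can be verified with a two-line calculation; the numbers below are illustrative:

```python
def overall_power_lower_bound(k, beta=0.10):
    """If each of k independent co-primary hypotheses is powered at
    1 - beta/k, the overall power is (1 - beta/k)**k, which is always
    at least 1 - beta (a Bonferroni-type bound on the Type II error)."""
    per_component = 1 - beta / k
    return per_component, per_component ** k

# e.g., k = 4 components, target overall power 90%:
per_comp, overall = overall_power_lower_bound(k=4, beta=0.10)
# each component powered at 97.5% gives overall power of about 90.4%
```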
5. CONCLUDING REMARKS
Vaccine noninferiority/equivalence trials involve unique statistical issues when
compared with drug trials or other types of vaccine trials (e.g., vaccine efficacy
trials). Unlike drug trials, these vaccine noninferiority trials generally aim to assess
the biological effect of a vaccine on a healthy and inherently less heterogeneous study population. Adherence in these trials is usually high (completion of the vaccination
series if more than one dose is administered) because there tend to be few drop-outs
due to adverse experiences. A per-protocol approach is often considered to be the
primary approach for vaccine noninferiority trials, rather than the intention-to-treat
approach that is used in drug studies.
In contrast with other types of vaccine trials, the conclusions drawn from
noninferiority/equivalence vaccine trials are highly dependent on the identification
of appropriate correlates of protection and on the measurement of immune
responses by properly validated immunogenicity assays. In this paper, we have
focused on parametric inferential methods and related issues for establishing the
equivalence/noninferiority of immune response rates and geometric means of
immune responses. In addition to these parametric analyses, the comparability of
immune responses to vaccinations is commonly illustrated by graphical displays
of the reverse cumulative distribution curves of Pr(X ≥ x). These curves give the percentages of participants with immune responses greater than or equal to varying fixed levels x. Nonparametric methods have also been proposed by Stine
and Heyse (2001) to estimate the overlap or proportion of a similar response in
distributions, which can be used to measure the similarity between two distributions
of immune responses. In addition, it should be noted that, even though this paper
has focused on the most common types of vaccine noninferiority/equivalence trials
where the primary endpoints are immunogenicity related, the statistical techniques
described here can be used for efficacy and safety related endpoints as well
(Chan et al., 2003; Chan, 2003).
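As a minimal sketch of how a reverse cumulative distribution curve is tabulated (the antibody titer values below are hypothetical, and Python with NumPy is assumed), the curve is simply the empirical Pr(X ≥ x) evaluated over a grid of response levels:

```python
import numpy as np

def reverse_cdf(titers, grid):
    """Empirical Pr(X >= x): the fraction of participants whose immune
    response is at or above each level x in the grid."""
    titers = np.asarray(titers, dtype=float)
    return np.array([np.mean(titers >= x) for x in grid])

# hypothetical antibody titers for two vaccine groups
new_vaccine = [64, 128, 128, 256, 512, 512, 1024, 2048]
control = [32, 64, 128, 256, 256, 512, 1024, 1024]

grid = [32, 64, 128, 256, 512, 1024, 2048]
for x, a, b in zip(grid, reverse_cdf(new_vaccine, grid), reverse_cdf(control, grid)):
    print(f"titer >= {x:5d}: new {a:.3f}  control {b:.3f}")
```

Plotting the two curves on the same axes gives the usual visual comparison; curves lying close together across the range of x suggest similar response distributions.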
From a statistical research perspective, vaccine noninferiority/equivalence
trials are a fertile area of investigation. Each of the four aforementioned common
types of noninferiority/equivalence immunogenicity trials offers unique statistical
challenges. Further statistical research is needed in establishing immunogenicity
markers, selecting immunogenicity endpoints, choosing noninferiority margins,
and finding optimal statistical strategies (e.g., possible applications of Bayesian
techniques) to establish noninferiority/equivalence. Due to the emerging need for
multivalent combination vaccines to simplify vaccination schedules, the multiplicity
issue should also be a priority area for future research. The use of composite
endpoints or gate-keeping strategies may be useful in this area. The ultimate goal is
to have a readily available statistical toolkit that can be used to minimize the risk
of abandoning potentially beneficial vaccines or licensing inferior ones.
ACKNOWLEDGMENTS
We thank the referees for their thoughtful comments that led to an improved
manuscript.
REFERENCES
Agresti, A., Min, Y. (2001). On small-sample confidence intervals for parameters in discrete
distributions. Biometrics 57:963–971.
Blackwelder, W. C. (1982). Proving the null hypothesis in clinical trials. Controlled Clinical
Trials 3:345–353.
Blackwelder, W. C. (1995). Similarity/equivalence trials for combination vaccines. In: Williams, J. C., Goldenthal, K. L., Burns, D. L., Lewis, B. P., eds. Combined Vaccines and Simultaneous Administration. Annals of the New York Academy of Sciences 754:321–328.
Chan, I. S. F. (1998). Exact tests of equivalence and efficacy with a non-zero lower bound
for comparative studies. Statistics in Medicine 17:1403–1413.
Chan, I. S. F. (2002). Power and sample size determination for noninferiority trials using an
exact method. Journal of Biopharmaceutical Statistics 12(4):457–469.
Chan, I. S. F. (2003). Proving noninferiority or equivalence of two treatments with
dichotomous endpoints using exact methods. Statistical Methods in Medical Research
12:37–58.
Chan, I. S. F., Mehrotra, D. V. (2003). Confidence intervals and hypothesis testing.
Encyclopedia of Biopharmaceutical Statistics, 2nd ed. New York: Marcel Dekker,
pp. 231–234.
Chan, I. S. F., Wang, W. W. B., Heyse, J. F. (2003). Vaccine clinical trials. In: Chow, S. C., ed. Encyclopedia of Biopharmaceutical Statistics, 2nd ed. New York: Marcel Dekker, pp. 1005–1022.
Chan, I. S. F., Zhang, Z. X. (1999). Test-based exact confidence intervals for the difference
of two binomial proportions. Biometrics 55:1202–1209.
Dunnett, C. W., Gent, M. (1977). Significance testing to establish equivalence between
treatments with special reference to data in the form of 2 ×2 tables. Biometrics
33:593–602.
Ebbutt, A. F., Frith, L. (1998). Practical issues in equivalence trials. Statistics in Medicine 17:1691–1701.
European Medicines Agency (EMEA). (2005). Committee for Medicinal Products for Human Use (CHMP), Guideline on the choice of noninferiority margin. July.
Farrington, C. P., Manning, G. (1990). Test statistics and sample size formulae for
comparative binomial trials with null hypothesis of non-zero risk difference or non-
unity relative risk. Statistics in Medicine 9:1447–1454.
FDA. (1997). Guidance for industry for the evaluation of combination vaccines for
preventable diseases: production, testing and clinical studies. Center for Biologics
Evaluation and Research, Food and Drug Administration.
Frey et al. (1999). Interference of antibody production to hepatitis B surface antigen in a
combination hepatitis A/hepatitis B vaccine. Journal of Infectious Diseases 180(6):2018–2022.
Gail, M., Simon, R. (1985). Testing for qualitative interactions between treatment effects
and patient subsets. Biometrics 41:361–372.
Gail, M. H., Mark, S. D., Carroll, R. J., Green, S. B., Pee, D. (1996). On design
considerations and randomization-based inference for community intervention trials.
Statistics in Medicine 15:1069–1092.
Ganju, J., Mehrotra, D. V. (2003). Stratified experiments re-examined with emphasis on
multicenter clinical trials. Controlled Clinical Trials 24:167–181.
Horne, A. D. (1995). The statistical analysis of immunogenicity data in vaccine trials: a
review of methodologies and issues. Annals of the New York Academy of Sciences 754:329–346.
Horne, A. D., Lachenbruch, P. A., Getson, P. R., Hsu, H. (2001). Analysis of studies
to evaluate immune response to combination vaccines. Clinical Infectious Diseases
33(Suppl. 4):S306–S311.
ICH E9 Expert Working Group. (1999). Statistical principles for clinical trials: ICH
harmonized tripartite guidelines. Statistics in Medicine 18:1905–1942.
ICH E10 Guideline. (2000). Choice of control group and related issues in clinical trials.
International Conference on Harmonization (ICH), July.
Kong, L., Kohberger, R. C., Koch, G. G. (2004). Type I error and power in
noninferiority/equivalence trials with correlated endpoints: an example from vaccine
development trials. Journal of Biopharmaceutical Statistics 14:893–907.
Li, X., Mehrotra, D. V., Barnard, J. (in press). Analysis of incomplete longitudinal binary
data using multiple imputation. Statistics in Medicine.
Liang, K., Zeger, S. (2000). Longitudinal data analysis of continuous and discrete response
for pre-post designs. Sankhya 62:134–138.
Mehrotra, D. V. (2001). Stratification issues with binary endpoints. Drug Information Journal
35(4):1343–1350.
Mehrotra, D. V. (2002). Stratified comparative clinical trials: analysis and interpretation issues. Proceedings of the International Biometric Conference.
Mehrotra, D. V. (in press). Vaccine clinical trials: a statistical primer. Journal of Biopharmaceutical Statistics.
Mehrotra, D. V., Chan, I. S. F. (2000). Testing for treatment by stratum interaction in
stratified comparative binomial trials. Joint Statistical Meetings, August.
Mehrotra, D. V., Chan, I. S. F., Berger, R. L. (2003). A cautionary note on
exact unconditional inference for a difference between two independent binomial
proportions. Biometrics 59:441–450.
Mehrotra, D. V., Railkar, R. (2000). Minimum risk weights for comparing treatments in
stratified binomial trials. Statistics in Medicine 19:811–825.
Miettinen, O., Nurminen, M. (1985). Comparative analysis of two rates. Statistics in Medicine
4:213–226.
Pan, G. H., Wolfe, D. A. (1997). Test for qualitative interaction of clinical significance.
Statistics in Medicine 16:1645–1652.
Plikaytis, B. D., Carlone, P. M. (2004). Statistical considerations for vaccine immunogenicity
trials: Part 1 and Part 2. Vaccine 23(13):1596–1614.
Schuirmann, D. J. (1987). A comparison of the two one-sided tests procedure and the
power approach for assessing the equivalence of average bioavailability. Journal of
Pharmacokinetics and Biopharmaceutics 15:657–680.
Stine, R. A., Heyse, J. F. (2001). Nonparametric measures of overlap. Statistics in Medicine 20:215–236.
Temple, R. (1996). Problems in interpreting active control equivalence trials. Accountability
in Research 4:267–275.
Wang, W. W. B., Li, D., Liu, F., Chan, I. S. F. (2003). Analysis of immune responses in vaccine clinical trials with a pre-post design. Joint Statistical Meetings, San Francisco,
August.
Wiens, B. L., Heyse, J. F. (2003). Testing for interaction in studies of noninferiority. Journal
of Biopharmaceutical Statistics 13:103–115.
Wiens, B. L., Heyse, J. F., Matthews, H. (1996). Similarity of three treatments, with
application to vaccine development. Proceedings of the Biopharmaceutical Section, Joint Statistical Meetings, pp. 203–206.