Non-proportional effect size in group sequential design

Overview

The acronym NPES is short for non-proportional effect size. Although the concept is motivated primarily by the design of time-to-event trials under non-proportional hazards (NPH), we have simplified and generalized it here. The model is likely to be useful for rank-based survival tests beyond the logrank test, which we consider initially following Tsiatis (1982). It could also be useful in other situations where the treatment effect may vary over time in a trial. We generalize the framework of Chapter 2 of Proschan et al. (2006) to incorporate the possibility of the treatment effect changing during the course of a trial in some systematic way. This vignette addresses distribution theory and initial technical issues around computing:

boundary crossing probabilities
bounds satisfying targeted boundary crossing probabilities

This is then applied to generalize computational algorithms provided in Chapter 19 of Jennison and Turnbull (1999) that are used to compute boundary crossing probabilities as well as boundaries for group sequential designs. Additional specifics around boundary computation, power and sample size are provided in a separate vignette.

The probability model

The continuous model and E-process

We consider a simple example here to motivate distribution theory that is quite general and applies across many situations. For instance, Proschan et al. (2006) immediately suggest paired observations, time-to-event and binary outcomes as endpoints where the theory is applicable.

We assume for a given integer \(N>0\) that \(X_{i}\) are independent, \(i=1,2,\ldots\). For some integer \(K\leq N\) we assume we will perform \(K\) analyses, after \(0<n_1<n_2<\cdots<n_K = N\) observations are available for analysis. Note that we have not confined the number of observations to be at most \(N\), but \(N\) can be considered the final planned sample size. Proschan et al. (2006) refer to the estimation or E-process, which we extend here to

\[\hat{\theta}_k = \frac{\sum_{i=1}^{n_k} X_{i}}{n_k}\equiv \bar X_{k}.\] While Proschan et al. (2006) use \(\delta\) where we use \(\theta\), we stick more closely to the notation of Jennison and Turnbull (1999). For our example, \(\hat{\theta}_k\equiv\bar X_k\) is the sample average at analysis \(k\), \(1\leq k\leq K\). With a survival endpoint, \(\hat\theta_k\) would typically be a Cox model coefficient estimating the logarithm of the hazard ratio for experimental vs. control treatment, and \(n_k\) would be the planned number of events at analysis \(k\), \(1\leq k\leq K.\) Denoting \(t_k=n_k/N\), we assume that for some real-valued function \(\theta(t)\) for \(t \geq 0\) we have for \(1\leq k\leq K\)

\[E(\hat{\theta}_k) =\theta(t_k) =E(\bar X_k).\] In the models of Proschan et al. (2006) and Jennison and Turnbull (1999) we would have \(\theta(t)\) equal to some constant \(\theta\). We assume further that for \(i=1,2,\ldots\) \[\text{Var}(X_{i})=1.\] The sample average variance under this assumption is for \(1\leq k\leq K\)

\[\text{Var}(\hat\theta_k)=\text{Var}(\bar X_k) = 1/ n_k.\] The statistical information for the estimate \(\hat\theta_k\), \(1\leq k\leq K\), is in this case \[ \mathcal{I}_k \equiv \frac{1}{\text{Var}(\hat\theta_k)} = n_k.\] We now see that \(t_k\), \(1\leq k\leq K\), is the so-called information fraction at analysis \(k\) in that \(t_k=\mathcal{I}_k/\mathcal{I}_K.\)
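
As a small illustration of these definitions, the following Python sketch tabulates \(\mathcal{I}_k\), \(t_k\), \(E(\hat\theta_k)\) and \(\text{Var}(\hat\theta_k)\) for the unit-variance example. The function names and the delayed-effect form of \(\theta(t)\) are hypothetical choices made only for this illustration: the first 25% of observations are assumed to have mean 0 and later observations mean 0.4, so the cumulative average effect is \(\theta(t)=0.4\max(0,t-0.25)/t\).

```python
def information_summary(n, theta):
    """Tabulate I_k, t_k, E(theta_hat_k) and Var(theta_hat_k) for cumulative sizes n."""
    if not n or n[0] <= 0 or any(later <= earlier for earlier, later in zip(n, n[1:])):
        raise ValueError("n must be positive and strictly increasing")
    N = n[-1]  # final planned sample size, n_K = N
    rows = []
    for k, n_k in enumerate(n, start=1):
        t_k = n_k / N  # information fraction I_k / I_K
        rows.append({
            "k": k,
            "info": float(n_k),    # I_k = n_k because Var(X_i) = 1
            "t": t_k,
            "mean": theta(t_k),    # E(theta_hat_k) = theta(t_k)
            "var": 1.0 / n_k,      # Var(theta_hat_k) = 1 / I_k
        })
    return rows


def delayed_effect(t, size=0.4, delay=0.25):
    """Hypothetical theta(t): cumulative average of a mean that is 0, then `size`."""
    return size * max(0.0, t - delay) / t


for row in information_summary([50, 100, 200], delayed_effect):
    print(row)
```

With a constant function such as `lambda t: 0.3` in place of `delayed_effect`, the same table reduces to the constant-effect setting of Proschan et al. (2006) and Jennison and Turnbull (1999).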

Z-process

The Z-process is commonly used (e.g., Jennison and Turnbull (1999)) and will be used below to extend the computational algorithm in Chapter 19 of Jennison and Turnbull (1999). For \(k=1,\ldots,K\) it is defined by the following equivalent expressions:

\[Z_{k} = \frac{\hat\theta_k}{\sqrt{\text{Var}(\hat\theta_k)}}= \sqrt{\mathcal{I}_k}\hat\theta_k= \sqrt{n_k}\bar X_k.\]

The variance for \(1\leq k\leq K\) is \[\text{Var}(Z_k) = 1\] and the expected value is

\[E(Z_{k})= \sqrt{\mathcal{I}_k}\theta(t_{k})= \sqrt{n_k}E(\bar X_k) .\]

B-process

B-values are mnemonic for Brownian motion. For \(1\leq k\leq K\) we define \[B_{k}=\sqrt{t_k}Z_k\] which implies \[ E(B_{k}) = \sqrt{t_{k}\mathcal{I}_k}\theta(t_k) = t_k \sqrt{\mathcal{I}_K} \theta(t_k) = \mathcal{I}_k\theta(t_k)/\sqrt{\mathcal{I}_K}\] and \[\text{Var}(B_k) = t_k.\]

For our example, we have

\[B_k=\frac{1}{\sqrt N}\sum_{i=1}^{n_k}X_i.\] It can be useful to think of \(B_k\) as a sum of independent random variables.

Summary of E-, Z- and B-processes

Statistic	Example	Expected value	Variance
\(\hat\theta_k\)	\(\bar X_k\)	\(\theta(t_k)\)	\(\mathcal{I}_k^{-1}\)
\(Z_k=\sqrt{\mathcal{I}_k}\hat\theta_k\)	\(\sqrt{n_k}\bar X_k\)	\(\sqrt{\mathcal{I}_k}\theta(t_k)\)	\(1\)
\(B_k=\sqrt{t_k}Z_k\)	\(\sum_{i=1}^{n_k}X_i/\sqrt N\)	\(t_k\sqrt{\mathcal{I}_K}\ \theta(t_k)=\mathcal{I}_k\ \theta(t_k)/\sqrt{\mathcal{I}_K}\)	\(t_k\)
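
The table above can be evaluated numerically. The sketch below is a minimal illustration, not part of any package: `process_moments` and `delayed_effect` are hypothetical names, and the delayed-effect \(\theta(t)\) is an assumed example. It returns the (expected value, variance) pair of each process at every analysis.

```python
import math


def process_moments(info, theta):
    """Return (mean, variance) of the E-, Z- and B-process at each analysis.

    info  : statistical information I_1 < ... < I_K
    theta : function giving theta(t) at information fraction t
    """
    info_final = info[-1]
    rows = []
    for info_k in info:
        t_k = info_k / info_final
        th = theta(t_k)
        rows.append({
            "t": t_k,
            "E": (th, 1.0 / info_k),                      # theta_hat_k
            "Z": (math.sqrt(info_k) * th, 1.0),           # Z_k
            "B": (t_k * math.sqrt(info_final) * th, t_k), # B_k
        })
    return rows


def delayed_effect(t, size=0.4, delay=0.25):
    """Hypothetical theta(t) used only for illustration."""
    return size * max(0.0, t - delay) / t


for row in process_moments([50, 100, 200], delayed_effect):
    print(row)
```

Because \(B_k=\sqrt{t_k}Z_k\), the B-process mean in each row equals \(\sqrt{t_k}\) times the Z-process mean, which gives a quick consistency check on the two expressions in the last table row.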

Conditional independence, covariance and canonical form

We assume independent increments in the B-process, which will be demonstrated for the simple example above. That is, for \(1\leq j < k\leq K\) \[\tag{1} B_k - B_j \sim \text{Normal} (\sqrt{\mathcal{I}_K}(t_k\theta(t_k)- t_j\theta(t_j)), t_k-t_j)\] independent of \(B_1,\ldots,B_j\). Since \(\sqrt{\mathcal{I}_K}\,t_k=\mathcal{I}_k/\sqrt{\mathcal{I}_K}\), the mean in (1) can equivalently be written as \((\mathcal{I}_k\theta(t_k)-\mathcal{I}_j\theta(t_j))/\sqrt{\mathcal{I}_K}\).

As noted above, for a given \(1\leq j\leq K\) we have for our example \[B_j=\frac{1}{\sqrt N}\sum_{i=1}^{n_j}X_i\] and, for \(j<k\), \[B_k-B_j=\frac{1}{\sqrt N}\sum_{i=n_j + 1}^{n_k}X_i,\] so the independence of \(B_j\) and \(B_k-B_j\) is obvious for this example.

Because of independence of the sequence \(X_i\), \(i=1,2,\ldots\), we immediately have for \(1\leq j\leq k\leq K\) \[\text{Cov}(B_j,B_k) = \text{Var}(B_j) = t_j.\] This leads further to \[\text{Corr}(B_j,B_k)=\frac{t_j}{\sqrt{t_jt_k}}=\sqrt{t_j/t_k}=\sqrt{\mathcal{I}_j/\mathcal{I}_k}=\text{Corr}(Z_j,Z_k)=\text{Cov}(Z_j,Z_k)\] which is the covariance structure in the so-called canonical form of Jennison and Turnbull (1999).
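
The correlation structure can be checked empirically. The following Python sketch, under the stated assumptions, builds B-process paths by summing independent normal increments as in equation (1) and compares a sample correlation with \(\sqrt{t_j/t_k}\). All function names, the seed and the mean values passed in the example are hypothetical choices for illustration.

```python
import math
import random


def canonical_correlation(t):
    """Matrix with entries sqrt(t_j / t_k) for j <= k (symmetric)."""
    K = len(t)
    return [[math.sqrt(min(t[j], t[k]) / max(t[j], t[k])) for k in range(K)]
            for j in range(K)]


def simulate_b_paths(t, mean_b, reps, seed=0):
    """Simulate (B_1, ..., B_K) by accumulating independent normal increments.

    t      : information fractions t_1 < ... < t_K
    mean_b : E(B_k) = sqrt(I_K) * t_k * theta(t_k) at each analysis
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(reps):
        b, t_prev, m_prev, path = 0.0, 0.0, 0.0, []
        for t_k, m_k in zip(t, mean_b):
            # Increment has mean E(B_k) - E(B_j) and variance t_k - t_j.
            b += rng.gauss(m_k - m_prev, math.sqrt(t_k - t_prev))
            path.append(b)
            t_prev, m_prev = t_k, m_k
        paths.append(path)
    return paths


def sample_correlation(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


t = [0.25, 0.5, 1.0]
paths = simulate_b_paths(t, mean_b=[0.0, 1.0, 3.0], reps=20000, seed=1)
first = [p[0] for p in paths]
last = [p[2] for p in paths]
print("theory:", canonical_correlation(t)[0][2])
print("simulated:", sample_correlation(first, last))
```

Since \(Z_k=B_k/\sqrt{t_k}\) is a rescaling of \(B_k\), the same sample correlation applies to the Z-process. Note that the non-constant means do not affect the correlation, only the location of each \(B_k\).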

Test bounds and crossing probabilities

In this section we define notation for bounds and boundary crossing probabilities for a group sequential design. We also define an algorithm for computing bounds based on a targeted boundary crossing probability at each analysis. The notation will be used elsewhere for defining one- and two-sided group sequential hypothesis testing. A value of \(\theta(t)>0\) will reflect a positive benefit.

For \(k=1,2,\ldots,K-1\), interim cutoffs \(-\infty \leq a_k< b_k\leq \infty\) are set; final cutoffs \(-\infty \leq a_K\leq b_K <\infty\) are also set. An infinite efficacy bound at an analysis means that bound cannot be crossed at that analysis. Thus, \(3K\) parameters define a group sequential design: \(a_k\), \(b_k\), and \(\mathcal{I}_k\), \(k=1,2,\ldots,K\).
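
One way to hold the \(3K\) design parameters together with the constraints just stated is sketched below. The class `GroupSequentialDesign` is a hypothetical container written for this illustration and is not the interface of any package.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupSequentialDesign:
    """Hypothetical container for the 3K parameters a_k, b_k and I_k."""
    info: tuple    # statistical information I_1 < ... < I_K
    lower: tuple   # lower bounds a_k on the Z scale (may be -inf)
    upper: tuple   # upper bounds b_k on the Z scale (may be +inf before analysis K)

    def __post_init__(self):
        K = len(self.info)
        if K == 0 or len(self.lower) != K or len(self.upper) != K:
            raise ValueError("info, lower and upper must share a positive length K")
        if self.info[0] <= 0 or any(nxt <= cur for cur, nxt in zip(self.info, self.info[1:])):
            raise ValueError("information must be positive and strictly increasing")
        for k in range(K - 1):
            # Interim analyses: -inf <= a_k < b_k <= inf
            if not self.lower[k] < self.upper[k]:
                raise ValueError(f"interim analysis {k + 1}: need a_k < b_k")
        # Final analysis: -inf <= a_K <= b_K < inf
        if not (self.lower[-1] <= self.upper[-1] < math.inf):
            raise ValueError("final analysis: need a_K <= b_K < infinity")

    @property
    def fractions(self):
        """Information fractions t_k = I_k / I_K."""
        return tuple(i / self.info[-1] for i in self.info)


design = GroupSequentialDesign(
    info=(50, 100, 200),
    lower=(-math.inf, 0.0, 2.0),
    upper=(math.inf, 3.0, 2.0),
)
print(design.fractions)
```

In this example the first analysis has no bounds that can be crossed on the upper side, and the final analysis uses \(a_K=b_K\) so that the trial always ends with a decision.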

Notation for boundary crossing probabilities

We now apply the above distributional assumptions to compute boundary crossing probabilities. We use a shorthand notation in this section in which \(\theta\) represents the function \(\theta(\cdot)\) and \(\theta=0\) represents \(\theta(t)\equiv 0\) for all \(t\). We denote the probability of crossing the upper boundary at analysis \(k\) without previously crossing a bound by

\[\alpha_{k}(\theta)=P_{\theta}(\{Z_{k}\geq b_{k}\}\cap_{j=1}^{k-1}\{a_{j}\leq Z_{j}< b_{j}\}),\] \(k=1,2,\ldots,K.\)

Next, we consider analogous notation for the lower bound. For \(k=1,2,\ldots,K\) denote the probability of crossing a lower bound at analysis \(k\) without previously crossing any bound by

\[\beta_{k}(\theta)=P_{\theta}(\{Z_{k}< a_{k}\}\cap_{j=1}^{k-1}\{ a_{j}\leq Z_{j}< b_{j}\}).\] For symmetric testing at analysis \(k\) we would have \(a_k= - b_k\) and \(\beta_k(0)=\alpha_k(0),\) \(k=1,2,\ldots,K\). The total lower boundary crossing probability for a trial is denoted by \[\beta(\theta)\equiv\sum_{k=1}^{K}\beta_{k}(\theta).\] Note that we can also set \(a_k= -\infty\) for any or all analyses if a lower bound is not desired, \(k=1,2,\ldots,K\). For \(k<K\), we can set \(b_k=\infty\) where an upper bound is not desired. Obviously, for each \(k\), we want either \(a_k>-\infty\) or \(b_k<\infty\).
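
To make the definitions of \(\alpha_k(\theta)\) and \(\beta_k(\theta)\) concrete, the sketch below computes them by recursive numerical integration over the B-process, using the independent normal increments assumed earlier. It is in the spirit of the approach in Chapter 19 of Jennison and Turnbull (1999) but uses a plain midpoint rule on a truncated grid, which is a simplification chosen for this illustration; the function names and the `grid` and `tail` parameters are hypothetical.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def crossing_probabilities(info, theta, lower, upper, grid=400, tail=8.0):
    """Return ([alpha_1..alpha_K], [beta_1..beta_K]) for Z-scale bounds.

    info         : statistical information I_1 < ... < I_K
    theta        : values theta(t_1), ..., theta(t_K)
    lower, upper : bounds a_k, b_k on the Z scale (may be -inf / +inf)
    """
    K = len(info)
    t = [i / info[-1] for i in info]
    mean_b = [math.sqrt(info[-1]) * t[k] * theta[k] for k in range(K)]  # E(B_k)
    alpha, beta = [], []
    points, weights = [0.0], [1.0]  # B_0 = 0 with probability one
    t_prev, mean_prev = 0.0, 0.0
    for k in range(K):
        sd = math.sqrt(t[k] - t_prev)   # increment standard deviation
        drift = mean_b[k] - mean_prev   # increment mean
        a_b = lower[k] * math.sqrt(t[k])  # bounds on the B scale
        b_b = upper[k] * math.sqrt(t[k])
        # Probability of first crossing at analysis k, summed over prior states.
        alpha.append(sum(w * norm_cdf(-(b_b - x - drift) / sd)
                         for x, w in zip(points, weights)))
        beta.append(sum(w * norm_cdf((a_b - x - drift) / sd)
                        for x, w in zip(points, weights)))
        if k == K - 1:
            break
        # Midpoint grid on the continuation region, truncated at +/- tail SDs.
        lo = max(a_b, mean_b[k] - tail * math.sqrt(t[k]))
        hi = min(b_b, mean_b[k] + tail * math.sqrt(t[k]))
        h = max(hi - lo, 0.0) / grid
        new_points = [lo + (i + 0.5) * h for i in range(grid)]
        new_weights = [
            h * sum(w * norm_pdf((y - x - drift) / sd)
                    for x, w in zip(points, weights)) / sd
            for y in new_points
        ]
        points, weights = new_points, new_weights
        t_prev, mean_prev = t[k], mean_b[k]
    return alpha, beta


# Hypothetical three-analysis design with a non-constant effect size.
alpha, beta = crossing_probabilities(
    info=[50, 100, 200],
    theta=[0.0, 0.2, 0.3],
    lower=[-math.inf, 0.0, 2.0],
    upper=[math.inf, 3.0, 2.0],
)
print("alpha_k:", alpha)
print("beta_k:", beta)
```

Because the final bounds in this example satisfy \(a_K=b_K\), the upper and lower crossing probabilities should sum to approximately 1, which is a useful check on the integration. When the first analysis has infinite bounds on both sides, \(\alpha_2(\theta)\) reduces to the simple normal tail probability \(1-\Phi(b_2-\sqrt{\mathcal{I}_2}\theta(t_2))\).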

References

Jennison, Christopher, and Bruce W. Turnbull. 1999. Group Sequential Methods with Applications to Clinical Trials. Chapman & Hall/CRC.

Proschan, Michael A., K. K. Gordon Lan, and Janet Turk Wittes. 2006. Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.

Tsiatis, Anastasios A. 1982. “Repeated Significance Testing for a General Class of Statistics Used in Censored Survival Analysis.” Journal of the American Statistical Association 77 (380): 855–61.

Keaven M. Anderson