Computing spending boundaries in group sequential design
Keaven Anderson and Yujie Zhao
Source: vignettes/articles/story-seven-test-types.Rmd
Introduction
We compare the derivation of different spending bounds using the gsDesign2 and gsDesign packages. In gsDesign, there are 6 types of bounds. We demonstrate here how to replicate these using gsDesign2. In gsDesign2, the gs_spending_bound() function can be used to derive spending boundaries for all group sequential design derivations and power calculations. We demonstrate with the gs_design_ahr() function here, using designs under proportional hazards assumptions to compare with gsDesign::gsSurv(). Since the sample size methods differ between the gsDesign2::gs_design_ahr() and gsDesign::gsSurv() functions, we use continuous sample sizes so that spending bounds (Z-values, nominal p-values, spending) should be identical except where noted. Indeed, we are able to reproduce bounds to a high degree of accuracy. Due to the different sample size methods, sample size and other boundary approximations vary slightly.
We also present a seventh example to implement a futility bound based on observed hazard ratio as well as a Haybittle-Peto-like efficacy bound. In particular, the futility bound would be difficult to implement using the gsDesign package while it is straightforward using gsDesign2.
For the last two examples, we implement integer sample size and event counts using the to_integer() function for the gsDesign2 package and the toInteger() function for the gsDesign package. Integer-based designs would generally be used in all cases other than when comparing package computations, as in Examples 1–5.
For all of our examples, we will use the following design assumptions:
library(gsDesign)
library(gsDesign2)

trial_duration <- 36 # Planned trial duration
info_frac <- c(.35, .7, 1) # Information fraction at analyses
# 16 month planned enrollment with constant rate
enroll_rate <- define_enroll_rate(duration = 16, rate = 1)
# Minimum follow-up for gsSurv() (computed)
minfup <- trial_duration - sum(enroll_rate$duration)
# Failure rates
fail_rate <- define_fail_rate(
  duration = Inf, # Single time period, exponential failure
  fail_rate = log(2) / 12, # Exponential time-to-event with 12 month median
  hr = .7, # Proportional hazards
  dropout_rate = -log(.99) / 12 # 1% dropout rate per year
)
alpha <- 0.025 # Type I error (one-sided)
beta <- 0.15 # 85% power = 15% Type II error
ratio <- 1 # Randomization ratio (experimental / control)
The choice of Type II error of 0.15 corresponding to 85% power is intentional. This allows for more impactful futility bounds at interim analyses. Many teams may decide on the more typical 90% power (beta = .1), but this can make futility bounds less likely to impact early decisions.
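To give a rough sense of what the power choice costs in events, the following sketch uses the Schoenfeld approximation for a fixed (single-analysis) design with 1:1 randomization. It is an illustration only and is not the method used by either package; the helper name schoenfeld_events is hypothetical.

```r
# Hypothetical helper: Schoenfeld approximation for the number of events
# required in a fixed design with 1:1 randomization
# (an illustration, not the gsDesign2 or gsDesign sample size method)
schoenfeld_events <- function(hr, alpha = 0.025, beta = 0.15) {
  4 * (qnorm(1 - alpha) + qnorm(1 - beta))^2 / log(hr)^2
}

events_85 <- schoenfeld_events(hr = 0.7, beta = 0.15) # 85% power
events_90 <- schoenfeld_events(hr = 0.7, beta = 0.10) # 90% power
round(c(power_85 = events_85, power_90 = events_90), 1)
```

Under these assumptions the fixed design needs roughly 282 events for 85% power and roughly 330 events for 90% power; the group sequential designs below require somewhat more.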
Examples
Analogous to the gsDesign package, we look at 6 variations on combinations of efficacy and futility bounds.
Example 1: Efficacy bound only
A one-sided design has only an efficacy bound. An easy way to do this is to use a fixed bound (lower = gs_b) with negative infinite bounds (lpar = rep(-Inf, 3)); in the summary table produced, infinite bounds do not appear. The upper bound implements a spending bound (upper = gs_spending_bound), and the list of objects provided in upar describes the spending function and any associated parameters. The parts of the upar list used here are sf = gsDesign::sfLDOF, which selects a Lan-DeMets spending function that approximates an O’Brien-Fleming bound, and total_spend = alpha, which sets the total spending to the targeted Type I error for the study. The upper bound provides the Type I error control for the design as it is not specified elsewhere.
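As a minimal sketch of what this spending function does, the standard Lan-DeMets formula for O'Brien-Fleming-like spending can be evaluated in base R. The names ldof_spend and t_frac are hypothetical and used only for illustration; bounds after the first analysis require numerical integration, which is what the packages provide.

```r
# Lan-DeMets spending function approximating O'Brien-Fleming bounds:
# alpha(t) = 2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(t))
ldof_spend <- function(t, alpha = 0.025) {
  2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(t))
}

t_frac <- c(0.35, 0.7, 1) # planned information fractions
cum_spend <- ldof_spend(t_frac)
cum_spend

# At the first analysis, the bound is the upper-tail normal quantile
# of the cumulative spend at that analysis
first_bound <- qnorm(cum_spend[1], lower.tail = FALSE)
first_bound
```

The cumulative spend reaches the full 0.025 at information fraction 1, and the first bound computed this way is close to the first efficacy Z-value reported in the design output.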
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
one_sided <- gsDesign2::gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, beta = beta,
  # Information fraction at analyses and trial duration
  info_frac = info_frac, analysis_time = trial_duration,
  # Precision parameters for computations
  r = 32, tol = 1e-08,
  # Use NULL information for Type I error, H1 information for Type II error (power)
  info_scale = "h0_h1_info", # Default
  # Upper spending bound and corresponding parameter(s)
  upper = gs_spending_bound, upar = upar,
  # No lower bound
  lower = gs_b, lpar = rep(-Inf, 3)
)
one_sided |>
  summary() |>
  gsDesign2::as_gt(title = "Efficacy bound only", subtitle = "alpha-spending")
Efficacy bound only
alpha-spending
| Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
| --- | --- | --- | --- | --- | --- |
| Analysis: 1 Time: 14.5 N: 356 Events: 100.2 AHR: 0.7 Information fraction: 0.35 | | | | | |
| Efficacy | 3.61 | 0.0002 | 0.4858 | 0.0352 | 0.0002 |
| Analysis: 2 Time: 23.3 N: 393.9 Events: 200.4 AHR: 0.7 Information fraction: 0.7 | | | | | |
| Efficacy | 2.44 | 0.0073 | 0.7083 | 0.5295 | 0.0074 |
| Analysis: 3 Time: 36 N: 393.9 Events: 286.2 AHR: 0.7 Information fraction: 1 | | | | | |
| Efficacy | 2.00 | 0.0227 | 0.7894 | 0.8500 | 0.0250 |
[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
Now we check this with gsDesign::gsSurv(). As noted above, sample size and event counts vary slightly from the design derived using gs_design_ahr(). This also results in slightly different crossing probabilities under the alternate hypothesis at interim analyses as well as slightly different approximate hazard ratios required to cross bounds.
oneSided <- gsSurv(
  alpha = alpha, beta = beta, timing = info_frac, T = trial_duration, minfup = minfup,
  lambdaC = fail_rate$fail_rate, eta = fail_rate$dropout_rate, hr = fail_rate$hr,
  r = 32, tol = 1e-08, # Precision parameters for computations
  test.type = 1, # One-sided bound; efficacy only
  # Upper bound parameters
  sfu = upar$sf, sfupar = upar$param
)
oneSided |> gsBoundSummary()
#>    Analysis              Value Efficacy
#>   IA 1: 35%                  Z   3.6128
#>      N: 356        p (1-sided)   0.0002
#> Events: 100       ~HR at bound   0.4852
#>   Month: 14   P(Cross) if HR=1   0.0002
#>             P(Cross) if HR=0.7   0.0338
#>   IA 2: 70%                  Z   2.4406
#>      N: 394        p (1-sided)   0.0073
#> Events: 200       ~HR at bound   0.7079
#>   Month: 23   P(Cross) if HR=1   0.0074
#>             P(Cross) if HR=0.7   0.5341
#>       Final                  Z   2.0002
#>      N: 394        p (1-sided)   0.0227
#> Events: 286       ~HR at bound   0.7891
#>   Month: 36   P(Cross) if HR=1   0.0250
#>             P(Cross) if HR=0.7   0.8500
Comparing Z-value bounds directly, we see they are the same through approximately 6 digits with the precision parameters chosen (r = 32, tol = 1e-08):
one_sided$bound$z - oneSided$upper$bound
#> [1] 1.349247e-07 9.218765e-07 3.515345e-07
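One way to turn such differences into a "digits of agreement" statement is sketched below. The helper digits_agree and the perturbed values are hypothetical and purely illustrative; in practice the two arguments would be the bound vectors from the two packages.

```r
# Hypothetical helper: approximate number of decimal places on which
# two numeric vectors agree (Inf where the values are identical)
digits_agree <- function(x, y) -log10(abs(x - y))

# Illustrative values only: bounds and copies perturbed in the 7th decimal
z_a <- c(3.6128, 2.4406, 2.0002)
z_b <- z_a + c(1e-7, 9e-7, 4e-7)
agreement <- digits_agree(z_a, z_b)
round(agreement, 1)
```

Values of 6 or more correspond to the "approximately 6 digits" of agreement described in the text.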
Example 2: Symmetric 2-sided design
We now derive a symmetric 2-sided design. This requires use of the argument h1_spending = FALSE to use α-spending for both the upper and lower bounds. While the lower bound is labeled as a futility bound in the table, it would be better termed an efficacy bound for control better than experimental treatment.
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
lpar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
symmetric <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, beta = beta,
  # Information fraction at analyses and trial duration
  info_frac = info_frac, analysis_time = trial_duration,
  # Precision parameters for computations
  r = 32, tol = 1e-08,
  # Use NULL information for Type I error, H1 information for power
  info_scale = "h0_h1_info", # Default
  # Function and parameter(s) for upper and lower spending bounds
  upper = gs_spending_bound, upar = upar,
  lower = gs_spending_bound, lpar = lpar,
  # Symmetric designs use binding bounds
  binding = TRUE,
  h1_spending = FALSE # Use null hypothesis spending for lower bound
)
symmetric |>
  summary() |>
  gsDesign2::as_gt(
    title = "2-sided Symmetric Design",
    subtitle = "Single spending function"
  )
2-sided Symmetric Design
Single spending function
| Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
| --- | --- | --- | --- | --- | --- |
| Analysis: 1 Time: 14.5 N: 356 Events: 100.2 AHR: 0.7 Information fraction: 0.35 | | | | | |
| Futility | -3.61 | 0.9998 | 2.0584 | 0.0000 | 0.0002 |
| Efficacy | 3.61 | 0.0002 | 0.4858 | 0.0352 | 0.0002 |
| Analysis: 2 Time: 23.3 N: 393.9 Events: 200.4 AHR: 0.7 Information fraction: 0.7 | | | | | |
| Futility | -2.44 | 0.9927 | 1.4118 | 0.0000 | 0.0074 |
| Efficacy | 2.44 | 0.0073 | 0.7083 | 0.5295 | 0.0074 |
| Analysis: 3 Time: 36 N: 393.9 Events: 286.2 AHR: 0.7 Information fraction: 1 | | | | | |
| Futility | -2.00 | 0.9773 | 1.2668 | 0.0000 | 0.0250 |
| Efficacy | 2.00 | 0.0227 | 0.7894 | 0.8500 | 0.0250 |
[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
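The symmetry in this table can be checked with a short base R sketch: the lower bound mirrors the upper bound on the Z scale, and the nominal p-values in the table are upper-tail probabilities for both bounds. The Z-values are copied from the Example 1 gsSurv() output; variable names are illustrative.

```r
# Efficacy Z-values as printed in the Example 1 gsSurv() output
z_upper <- c(3.6128, 2.4406, 2.0002)
z_lower <- -z_upper # symmetric design: lower bound mirrors the upper bound

# Nominal one-sided p-values (upper tail) for each bound
p_upper <- pnorm(z_upper, lower.tail = FALSE)
p_lower <- pnorm(z_lower, lower.tail = FALSE)
round(cbind(z_lower, p_lower, z_upper, p_upper), 4)
```

The two p-value columns sum to 1 at each analysis, which is why the lower bound shows nominal p-values near 1.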
We compare with gsDesign::gsSurv().
Symmetric <-
  gsSurv(
    test.type = 2, # Two-sided symmetric bound
    alpha = alpha, beta = beta, timing = info_frac, T = trial_duration, minfup = minfup, r = 32, tol = 1e-08,
    lambdaC = fail_rate$fail_rate, eta = fail_rate$dropout_rate, hr = fail_rate$hr,
    sfu = upar$sf, sfupar = upar$param
  )
Symmetric |> gsBoundSummary()
#>    Analysis              Value Efficacy Futility
#>   IA 1: 35%                  Z   3.6128  -3.6128
#>      N: 356        p (1-sided)   0.0002   0.0002
#> Events: 100       ~HR at bound   0.4852   2.0609
#>   Month: 14   P(Cross) if HR=1   0.0002   0.0002
#>             P(Cross) if HR=0.7   0.0338   0.0000
#>   IA 2: 70%                  Z   2.4406  -2.4406
#>      N: 394        p (1-sided)   0.0073   0.0073
#> Events: 200       ~HR at bound   0.7079   1.4126
#>   Month: 23   P(Cross) if HR=1   0.0074   0.0074
#>             P(Cross) if HR=0.7   0.5341   0.0000
#>       Final                  Z   2.0002  -2.0002
#>      N: 394        p (1-sided)   0.0227   0.0227
#> Events: 286       ~HR at bound   0.7891   1.2673
#>   Month: 36   P(Cross) if HR=1   0.0250   0.0250
#>             P(Cross) if HR=0.7   0.8500   0.0000
Comparing Z-value bounds directly, we again see approximately 6 digits of accuracy.
Example 3: Asymmetric 2-sided design with β-spending and binding futility
Designs with binding futility bounds are generally not considered acceptable for Phase 3 trials. A binding futility bound means that Type I error computations assume the trial stops when a futility bound is crossed. If the trial instead continues after a futility bound has been crossed, a not infrequent occurrence, Type I error is no longer controlled with the computed efficacy bound. For a Phase 2b study, this may be acceptable, and it results in a slightly smaller sample size and less stringent efficacy bounds after the first analysis than the comparable design with a non-binding futility bound presented in Example 4.
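The β-spending used for futility in this example follows a Hwang-Shih-DeCani spending function. A minimal base R sketch of that formula is below, assuming gamma = -0.5, the value consistent with the cumulative futility crossing probabilities under HR = 0.7 in the gsSurv() output for this example; the helper name hsd_spend is hypothetical.

```r
# Hwang-Shih-DeCani spending function (gamma != 0):
# spend(t) = total * (1 - exp(-gamma * t)) / (1 - exp(-gamma))
hsd_spend <- function(t, total, gamma) {
  total * (1 - exp(-gamma * t)) / (1 - exp(-gamma))
}

# Cumulative beta-spending at the planned information fractions
beta_spend <- hsd_spend(t = c(0.35, 0.7, 1), total = 0.15, gamma = -0.5)
round(beta_spend, 4)
```

A negative gamma spends little Type II error early, which keeps the first futility bound from being too aggressive.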
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
lpar <- list(sf = gsDesign::sfHSD, total_spend = beta, param = -.5)
asymmetric_binding <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, beta = beta,
  # Information fraction at analyses and trial duration
  info_frac = info_frac, analysis_time = trial_duration,
  # Precision parameters for computations
  r = 32, tol = 1e-08,
  # Use NULL information for Type I error, H1 information for Type II error and power
  info_scale = "h0_h1_info",
  # Function and parameter(s) for upper and lower spending bounds
  upper = gs_spending_bound, upar = upar,
  lower = gs_spending_bound, lpar = lpar,
  # Asymmetric beta-spending design using binding bounds
  binding = TRUE,
  h1_spending = TRUE # Use beta-spending for futility
)
asymmetric_binding |>
  summary() |>
  gsDesign2::as_gt(
    title = "2-sided asymmetric design with binding futility",
    subtitle = "Both alpha- and beta-spending used"
  )
2-sided asymmetric design with binding futility
Both alpha- and beta-spending used
| Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
| --- | --- | --- | --- | --- | --- |
| Analysis: 1 Time: 14.5 N: 380 Events: 106.9 AHR: 0.7 Information fraction: 0.35 | | | | | |
| Futility | 0.12 | 0.4540 | 0.9779 | 0.0435 | 0.5460 |
| Efficacy | 3.61 | 0.0002 | 0.4972 | 0.0400 | 0.0002 |
| Analysis: 2 Time: 23.3 N: 420.5 Events: 213.9 AHR: 0.7 Information fraction: 0.7 | | | | | |
| Futility | 1.15 | 0.1243 | 0.8540 | 0.0960 | 0.8861 |
| Efficacy | 2.44 | 0.0074 | 0.7164 | 0.5621 | 0.0074 |
| Analysis: 3 Time: 36 N: 420.5 Events: 305.5 AHR: 0.7 Information fraction: 1 | | | | | |
| Futility | 1.91 | 0.0282 | 0.8039 | 0.1499 | 0.9740 |
| Efficacy | 1.93 | 0.0268 | 0.8019 | 0.8500 | 0.0250 |
[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
We compare with gsDesign::gsSurv().
asymmetricBinding <- gsSurv(
  test.type = 3, # Two-sided asymmetric bound, binding futility
  alpha = alpha, beta = beta, timing = info_frac, T = trial_duration, minfup = minfup, r = 32, tol = 1e-07,
  lambdaC = fail_rate$fail_rate, eta = fail_rate$dropout_rate, hr = fail_rate$hr,
  sfu = upar$sf, sfupar = upar$param, sfl = lpar$sf, sflpar = lpar$param
)
asymmetricBinding |> gsBoundSummary()
#>    Analysis              Value Efficacy Futility
#>   IA 1: 35%                  Z   3.6128   0.1436
#>      N: 380        p (1-sided)   0.0002   0.4429
#> Events: 107       ~HR at bound   0.4971   0.9726
#>   Month: 14   P(Cross) if HR=1   0.0002   0.5571
#>             P(Cross) if HR=0.7   0.0387   0.0442
#>   IA 2: 70%                  Z   2.4382   1.1807
#>      N: 422        p (1-sided)   0.0074   0.1189
#> Events: 214       ~HR at bound   0.7164   0.8509
#>   Month: 23   P(Cross) if HR=1   0.0074   0.8913
#>             P(Cross) if HR=0.7   0.5679   0.0969
#>       Final                  Z   1.9232   1.9232
#>      N: 422        p (1-sided)   0.0272   0.0272
#> Events: 306       ~HR at bound   0.8024   0.8024
#>   Month: 36   P(Cross) if HR=1   0.0250   0.9750
#>             P(Cross) if HR=0.7   0.8500   0.1500
Comparing Z-value bounds directly, we again see approximately 6 digits of accuracy in spite of needing to relax accuracy to tol = 1e-07 in the call to gsSurv() in order to get convergence.
Example 4: Asymmetric 2-sided design with β-spending and non-binding futility bound
In the gsDesign package, asymmetric designs with non-binding β-spending used for futility are the default design. The objectives of this type of design include:
- Meaningful futility bounds to stop a trial early if no treatment benefit is emerging for the experimental treatment vs. control.
- Type I error is controlled even if the trial continues after a futility bound is crossed.
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
lpar <- list(sf = gsDesign::sfHSD, total_spend = beta, param = -.5)
asymmetric_nonbinding <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, beta = beta,
  # Information fraction at analyses and trial duration
  info_frac = info_frac, analysis_time = trial_duration,
  # Precision parameters for computations
  r = 32, tol = 1e-08,
  # Use NULL information for Type I error, H1 info for Type II error and power
  info_scale = "h0_h1_info", # Default
  # Function and parameter(s) for upper and lower spending bounds
  upper = gs_spending_bound, upar = upar,
  lower = gs_spending_bound, lpar = lpar,
  # Asymmetric beta-spending design using non-binding bounds
  binding = FALSE,
  h1_spending = TRUE # Use beta-spending for futility
)
asymmetric_nonbinding |>
  summary() |>
  gsDesign2::as_gt(
    title = "2-sided asymmetric design with non-binding futility",
    subtitle = "Both alpha- and beta-spending used"
  )
2-sided asymmetric design with non-binding futility
Both alpha- and beta-spending used
| Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
| --- | --- | --- | --- | --- | --- |
| Analysis: 1 Time: 14.5 N: 395.9 Events: 111.4 AHR: 0.7 Information fraction: 0.35 | | | | | |
| Futility | 0.15 | 0.4391 | 0.9714 | 0.0435 | 0.5609 |
| Efficacy | 3.61 | 0.0002 | 0.5043 | 0.0433 | 0.0002 |
| Analysis: 2 Time: 23.3 N: 438.1 Events: 222.8 AHR: 0.7 Information fraction: 0.7 | | | | | |
| Futility | 1.21 | 0.1136 | 0.8506 | 0.0960 | 0.8960 |
| Efficacy | 2.44 | 0.0073 | 0.7211 | 0.5822 | 0.0073 |
| Analysis: 3 Time: 36 N: 438.1 Events: 318.3 AHR: 0.7 Information fraction: 1 | | | | | |
| Futility | 1.98 | 0.0241 | 0.8013 | 0.1499 | 0.9773 |
| Efficacy | 2.00 | 0.0227 | 0.7991 | 0.8500 | [3] 0.0218 |
[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
[3] Cumulative alpha for final analysis (0.0218) is less than the full alpha (0.025) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.025 - 0.0032 = 0.0218) under the null hypothesis.
We compare with gsDesign::gsSurv().
asymmetricNonBinding <- gsSurv(
  test.type = 4, # Two-sided asymmetric bound, non-binding futility
  alpha = alpha, beta = beta, timing = info_frac, T = trial_duration, minfup = minfup, r = 32, tol = 1e-08,
  lambdaC = fail_rate$fail_rate, eta = fail_rate$dropout_rate, hr = fail_rate$hr,
  sfu = upar$sf, sfupar = upar$param, sfl = lpar$sf, sflpar = lpar$param
)
asymmetricNonBinding |> gsBoundSummary()
#>    Analysis              Value Efficacy Futility
#>   IA 1: 35%                  Z   3.6128   0.1860
#>      N: 398        p (1-sided)   0.0002   0.4262
#> Events: 112       ~HR at bound   0.5050   0.9654
#>   Month: 14   P(Cross) if HR=1   0.0002   0.5738
#>             P(Cross) if HR=0.7   0.0424   0.0442
#>   IA 2: 70%                  Z   2.4406   1.2406
#>      N: 440        p (1-sided)   0.0073   0.1074
#> Events: 224       ~HR at bound   0.7215   0.8471
#>   Month: 23   P(Cross) if HR=1   0.0073   0.9020
#>             P(Cross) if HR=0.7   0.5901   0.0969
#>       Final                  Z   2.0002   2.0002
#>      N: 440        p (1-sided)   0.0227   0.0227
#> Events: 320       ~HR at bound   0.7995   0.7995
#>   Month: 36   P(Cross) if HR=1   0.0215   0.9785
#>             P(Cross) if HR=0.7   0.8500   0.1500
Comparing Z-value bounds directly, we again see approximately 6 digits of accuracy.
Example 5: Asymmetric 2-sided design with null hypothesis spending and binding futility bound
Now we use null hypothesis probabilities to set futility bounds. The parameter alpha_star is used to set the total spending for the futility bound under the null hypothesis. For our example, this is set to 0.5, which is a 50% probability of crossing the futility bound at the interim and final analyses combined. This value is arbitrary and largely selected so that the interim futility bounds can be meaningful tests. The futility bound at the final analysis really has no role, so we use the test_lower argument to eliminate this evaluation at the final analysis. In this case, more than a minor trend in favor of control at the first or second interim will cross a futility bound. This is less stringent than the β-spending bounds previously described, but still addresses a potential ethical issue of continuing the trial when more than a minor trend in favor of control is present.
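To see what this spending implies at the first interim, here is a minimal base R sketch assuming Hwang-Shih-DeCani spending with gamma = 1 and a total spend of 0.5 under the null hypothesis. Variable names are illustrative; later bounds depend on the joint distribution across analyses and are left to the packages.

```r
# Hwang-Shih-DeCani spending with gamma = 1 and total spend alpha_star = 0.5
t_frac <- c(0.35, 0.7, 1) # planned information fractions
gamma <- 1
h0_spend <- 0.5 * (1 - exp(-gamma * t_frac)) / (1 - exp(-gamma))
round(h0_spend, 4)

# First futility bound: lower-tail normal quantile of the first spend,
# since the test statistic is standard normal under the null hypothesis
first_futility_z <- qnorm(h0_spend[1])
first_futility_z
```

The first futility bound is about -0.727, so only a trend in favor of control at the first interim would cross it.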
alpha_star <- .5
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
lpar <- list(sf = gsDesign::sfHSD, total_spend = alpha_star, param = 1)
asymmetric_safety_binding <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, beta = beta,
  # Information fraction at analyses and trial duration
  info_frac = info_frac, analysis_time = trial_duration,
  # Precision parameters for computations
  r = 32, tol = 1e-08,
  # Use null hypothesis (H0) information for both Type I and Type II error
  info_scale = "h0_info",
  # Function and parameter(s) for upper and lower spending bounds
  upper = gs_spending_bound, upar = upar,
  lower = gs_spending_bound, lpar = lpar,
  # No futility test at the final analysis
  test_lower = c(TRUE, TRUE, FALSE),
  # Asymmetric design using binding bounds
  binding = TRUE,
  h1_spending = FALSE # Use null-spending for futility
)
asymmetric_safety_binding |>
  summary() |>
  gsDesign2::as_gt(
    title = "2-sided asymmetric safety design with binding futility",
    subtitle = "Alpha-spending used for both bounds, asymmetrically"
  )
2-sided asymmetric safety design with binding futility
Alpha-spending used for both bounds, asymmetrically
| Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
| --- | --- | --- | --- | --- | --- |
| Analysis: 1 Time: 14.5 N: 359.6 Events: 101.2 AHR: 0.7 Information fraction: 0.35 | | | | | |
| Futility | -0.73 | 0.7664 | 1.1561 | 0.0060 | 0.2336 |
| Efficacy | 3.61 | 0.0002 | 0.4863 | 0.0340 | 0.0002 |
| Analysis: 2 Time: 23.3 N: 397.9 Events: 202.4 AHR: 0.7 Information fraction: 0.7 | | | | | |
| Futility | -0.42 | 0.6629 | 1.0611 | 0.0070 | 0.3982 |
| Efficacy | 2.44 | 0.0073 | 0.7087 | 0.5353 | 0.0074 |
| Analysis: 3 Time: 36 N: 397.9 Events: 289.1 AHR: 0.7 Information fraction: 1 | | | | | |
| Efficacy | 2.00 | 0.0229 | 0.7899 | 0.8500 | 0.0250 |
[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
asymmetricSafetyBinding <- gsSurv(
  test.type = 5, # Two-sided asymmetric bound, binding futility, H0 futility spending
  astar = alpha_star, # Total Type I error spend for futility
  alpha = alpha, beta = beta, timing = info_frac, T = trial_duration, minfup = minfup,
  lambdaC = fail_rate$fail_rate, eta = fail_rate$dropout_rate, hr = fail_rate$hr,
  sfu = upar$sf, sfupar = upar$param, sfl = lpar$sf, sflpar = lpar$param
)
asymmetricSafetyBinding |> gsBoundSummary()
#>    Analysis              Value Efficacy Futility
#>   IA 1: 35%                  Z   3.6128  -0.7271
#>      N: 356        p (1-sided)   0.0002   0.7664
#> Events: 101       ~HR at bound   0.4856   1.1565
#>   Month: 14   P(Cross) if HR=1   0.0002   0.2336
#>             P(Cross) if HR=0.7   0.0340   0.0060
#>   IA 2: 70%                  Z   2.4405  -0.4203
#>      N: 394        p (1-sided)   0.0073   0.6629
#> Events: 201       ~HR at bound   0.7082   1.0612
#>   Month: 23   P(Cross) if HR=1   0.0074   0.3982
#>             P(Cross) if HR=0.7   0.5353   0.0070
#>       Final                  Z   1.9979  -0.2531
#>      N: 394        p (1-sided)   0.0229   0.5999
#> Events: 286       ~HR at bound   0.7895   1.0304
#>   Month: 36   P(Cross) if HR=1   0.0250   0.5000
#>             P(Cross) if HR=0.7   0.8500   0.0072
Comparing Z-value bounds directly, we again see approximately 6 digits of accuracy. For gsSurv(), this did not require the alternate arguments for r and tol.
Example 6: Asymmetric 2-sided design with null hypothesis spending and non-binding futility bound
Again, we would recommend the non-binding bound presented here over the binding bound in Example 5. We again eliminate the final futility bound using the test_lower argument. In addition, we show how to eliminate the efficacy bound at interim 1, allowing a team to decide that it is too early to stop a trial for efficacy without longer-term data.
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
lpar <- list(sf = gsDesign::sfHSD, total_spend = alpha_star, param = 1)
asymmetric_safety_nonbinding <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, beta = beta,
  # Information fraction at analyses and trial duration
  info_frac = info_frac, analysis_time = trial_duration,
  # Precision parameters for computations
  r = 32, tol = 1e-08,
  # Use null hypothesis (H0) information for both Type I and Type II error
  info_scale = "h0_info",
  # Function and parameter(s) for upper spending bound; no efficacy test at interim 1
  upper = gs_spending_bound, upar = upar,
  test_upper = c(FALSE, TRUE, TRUE),
  # Function and parameter(s) for lower spending bound; no futility test at final
  lower = gs_spending_bound, lpar = lpar,
  test_lower = c(TRUE, TRUE, FALSE),
  # Asymmetric design using non-binding bounds
  binding = FALSE,
  h1_spending = FALSE # Use null-spending for futility
) |> to_integer()
asymmetric_safety_nonbinding |>
  summary() |>
  gsDesign2::as_gt(
    title = "2-sided asymmetric safety design with non-binding futility",
    subtitle = "Alpha-spending used for both bounds, asymmetrically"
  ) |>
  gt::tab_footnote(footnote = "Integer-based sample size and event counts")
2-sided asymmetric safety design with non-binding futility
Alpha-spending used for both bounds, asymmetrically
| Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
| --- | --- | --- | --- | --- | --- |
| Analysis: 1 Time: 14.4 N: 360 Events: 101 AHR: 0.7 Information fraction: 0.35 | | | | | |
| Futility | 1.05 | 0.1460 | 0.8108 | 0.2301 | 0.8540 |
| Analysis: 2 Time: 23.1 N: 400 Events: 202 AHR: 0.7 Information fraction: 0.7 | | | | | |
| Futility | 2.11 | 0.0175 | 0.7433 | 0.3950 | 0.9856 |
| Efficacy | 2.46 | 0.0070 | 0.7079 | 0.5002 | 0.0063 |
| Analysis: 3 Time: 35.9 N: 400 Events: 290 AHR: 0.7 Information fraction: 1 | | | | | |
| Efficacy | 2.00 | 0.0229 | 0.7909 | 0.5980 | [3] 0.0097 |
Integer-based sample size and event counts
[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
[3] Cumulative alpha for final analysis (0.0097) is less than the full alpha (0.025) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.025 - 0.0153 = 0.0097) under the null hypothesis.
The corresponding gsDesign::gsSurv() design is not strictly comparable since the option to eliminate some futility and efficacy analyses is not enabled.
asymmetricSafetyNonBinding <- gsSurv(
  test.type = 6, # Two-sided asymmetric bound, non-binding futility, H0 futility spending
  astar = alpha_star, # Total Type I error spend for futility
  alpha = alpha, beta = beta, timing = info_frac, T = trial_duration, minfup = minfup, r = 32, tol = 1e-08,
  lambdaC = fail_rate$fail_rate, eta = fail_rate$dropout_rate, hr = fail_rate$hr,
  sfu = upar$sf, sfupar = upar$param, sfl = lpar$sf, sflpar = lpar$param
)
asymmetricSafetyNonBinding |> gsBoundSummary()
Example 7: Alternate bound types
We consider two types of alternative boundary computation approaches.
- Computing futility bounds based on a hazard ratio.
- Computing efficacy bounds with a Haybittle-Peto or a related Fleming-Harrington-O’Brien approach.
We begin with a futility bound. We will consider a non-binding futility bound as it does not impact the efficacy bound. Assume the clinical trial team wishes to stop the trial at either of the two interim analyses if a targeted interim hazard ratio is not achieved. This approach can require a bit of iteration (trial and error) to incorporate the final design endpoint count; we skip over this iteration here. We assume we wish to consider stopping for futility if a hazard ratio greater than 1 is observed at interim analysis 1 with 104 events, or greater than 0.9 at interim analysis 2 with 209 events. The final analysis is planned for 300 events.
# Targeted events at interim and final analysis
# This is based on above designs and then adjusted, as necessary
targeted_events <- c(104, 209, 300)
We wish to translate the hazard ratios specified to corresponding Z-values; this can be done as follows.
interim_futility_z <- gsDesign::hrn2z(hr = c(1, .9), n = targeted_events[1:2])
interim_futility_z
#> [1] 0.0000000 0.7615897
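The translation between hazard ratios and Z-values can be sketched with the Schoenfeld approximation in base R. The helpers hr_to_z and z_to_hr below are hypothetical illustrations written for this example, not the gsDesign implementation, and assume that positive Z favors the experimental treatment.

```r
# Hypothetical helper: Schoenfeld approximation translating an observed
# hazard ratio into a Z-value (positive Z favors experimental treatment)
hr_to_z <- function(hr, events, ratio = 1) {
  -log(hr) * sqrt(events * ratio) / (1 + ratio)
}
# Hypothetical inverse: approximate hazard ratio at a given Z-value
z_to_hr <- function(z, events, ratio = 1) {
  exp(-z * (1 + ratio) / sqrt(events * ratio))
}

futility_z <- hr_to_z(hr = c(1, 0.9), events = c(104, 209))
futility_z
z_to_hr(futility_z, events = c(104, 209))
```

The inverse translation is the same idea behind the "~HR at bound" column in the summary tables: a Z-value bound and an event count imply an approximate hazard ratio at the bound.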
We will add a final futility bound of -Inf, indicating no final futility analysis; this gives us a vector of Z-value bounds for all analyses. For this type of bound, Type II error is computed from the specified bounds, rather than bounds being computed from specified spending as in the spending approach.
# Allows specifying fixed Z-values for futility
lower <- gs_b
# Translated HR bounds to Z-value scale; -Inf means no final futility test
lpar <- c(interim_futility_z, -Inf)
For the efficacy bound, we first consider a Haybittle-Peto fixed bound for interim analyses. Using a Bonferroni approach, we test at nominal levels 0.001, 0.001, and 0.023 at the 3 analyses. By not accounting for correlations, this will actually not quite use all of the 0.025 1-sided Type I error allowed. The reader can substitute fixed Z-value bounds at these nominal levels for the spending bound that follows to verify this.
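A minimal sketch of the Bonferroni version of the bound in base R is shown below; the variable names are illustrative. How the bounds would be passed to gsDesign2 is stated as an assumption in the comments, by analogy with the fixed futility bounds above.

```r
# Nominal one-sided levels at the three analyses (Bonferroni split of 0.025)
nominal_p <- c(0.001, 0.001, 0.023)
sum(nominal_p)

# Corresponding fixed Z-value efficacy bounds
hp_z <- qnorm(nominal_p, lower.tail = FALSE)
round(hp_z, 4)

# Assumption: these could be supplied as fixed bounds in the same way as the
# futility bounds above, i.e., upper = gs_b with upar = hp_z
```

Because the test statistics at the three analyses are positively correlated, the overall Type I error of these fixed bounds is somewhat below the 0.025 sum of the nominal levels.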
The alternative approach is to use a fixed spending approach at each analysis as suggested by Fleming, Harrington, and O’Brien (1984). Again, with some iteration not shown, we use a piecewise linear spending function to select interim bounds that match the desired Haybittle-Peto interim bounds. However, using this approach a slightly more liberal final bound is achieved that still controls Type I error.
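The piecewise linear spending idea can be sketched in base R with linear interpolation between knots. This is an illustration under the cumulative spending values used in this example (0.001 and 0.0018 at the interim analyses, 0.025 in total), with hypothetical names; it is not the gsDesign implementation.

```r
# Knots of a piecewise linear spending function on the information-fraction scale
targeted_events <- c(104, 209, 300)
alpha <- 0.025
knots_t <- c(0, targeted_events[1:2] / targeted_events[3], 1)
knots_spend <- c(0, 0.001, 0.0018, alpha)

# Linear interpolation between knots gives cumulative spending at any timing
linear_spend <- function(t) approx(knots_t, knots_spend, xout = t)$y

spend_at_analyses <- linear_spend(targeted_events / targeted_events[3])
spend_at_analyses
linear_spend(0.5) # an arbitrary timing between the two interim analyses
```

Evaluated at the planned analyses, the function returns exactly the targeted cumulative spending, which is how the spending parameters are chosen to reproduce the desired interim bounds.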
upper <- gs_spending_bound
upar <- list(
  sf = gsDesign::sfLinear,
  total_spend = alpha,
  # Timing of interim analyses followed by cumulative proportion of alpha spent
  param = c(targeted_events[1:2] / targeted_events[3], c(.001, .0018) / .025),
  timing = NULL
)
asymmetric_fixed_bounds <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio, beta = beta,
  # Information fraction at analyses and trial duration
  info_frac = info_frac, analysis_time = trial_duration,
  # Precision parameters for computations
  r = 32, tol = 1e-08,
  # Use null hypothesis (H0) information for both Type I and Type II error
  info_scale = "h0_info",
  # Upper spending bound and fixed lower bound with their parameter(s)
  upper = upper, upar = upar,
  lower = lower, lpar = lpar,
  # Non-binding futility bounds
  binding = FALSE
) |> to_integer()
asymmetric_fixed_bounds |>
  summary() |>
  gsDesign2::as_gt(
    title = "2-sided asymmetric safety design with fixed non-binding futility",
    subtitle = "Futility bounds computed to approximate HR"
  ) |>
  gt::tab_footnote(footnote = "Integer-based sample size and event counts")
2-sided asymmetric safety design with fixed non-binding futility
Futility bounds computed to approximate HR
| Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
| --- | --- | --- | --- | --- | --- |
| Analysis: 1 Time: 14.4 N: 372 Events: 104 AHR: 0.7 Information fraction: 0.35 | | | | | |
| Futility | 0.00 | 0.5000 | 1.0000 | 0.0345 | 0.5000 |
| Efficacy | 3.09 | 0.0010 | 0.5455 | 0.1018 | 0.0010 |
| Analysis: 2 Time: 23.1 N: 414 Events: 209 AHR: 0.7 Information fraction: 0.7 | | | | | |
| Futility | 0.76 | 0.2232 | 0.9000 | 0.0566 | 0.8019 |
| Efficacy | 3.10 | 0.0010 | 0.6510 | 0.3181 | 0.0018 |
| Analysis: 3 Time: 35.8 N: 414 Events: 300 AHR: 0.7 Information fraction: 1 | | | | | |
| Efficacy | 1.97 | 0.0244 | 0.7966 | 0.8520 | [3] 0.0234 |
Integer-based sample size and event counts
[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
[3] Cumulative alpha for final analysis (0.0234) is less than the full alpha (0.025) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.025 - 0.0016 = 0.0234) under the null hypothesis.
We see that the targeted bounds are achieved, with nominal p-values of 0.001 for each interim efficacy bound and the targeted hazard ratios at the interim futility bounds. With these methods, trial designers have more control over design characteristics they may desire. In particular, we note that the Haybittle-Peto efficacy bounds are less stringent at the first interim and more stringent at the second interim than the corresponding O’Brien-Fleming-like bounds we computed with the spending approach. This may or may not be desirable.