Efficacy and futility boundary update
Yujie Zhao and Keaven M. Anderson
Source:vignettes/articles/storyupdateboundary.Rmd
Design assumptions
We assume two analyses: an interim analysis (IA) and a final analysis (FA). The IA is planned 20 months after opening enrollment, followed by the FA at month 36. The planned enrollment period spans 14 months: the first 2 months have an enrollment rate of 1/3 of the final rate, the next 2 months have a rate of 2/3 of the final rate, and the remaining 10 months use the final rate. To obtain the targeted 90% power, these rates will be multiplied by a constant. The control arm is assumed to follow an exponential distribution with a median of 9 months, and the dropout rate is 0.0001 per month regardless of treatment group. Finally, the experimental treatment group is piecewise exponential with a 3-month delayed treatment effect; that is, HR = 1 in the first 3 months and HR = 0.6 thereafter.
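As a quick check on these assumptions, the implied hazard rates and medians can be written out. This is a sketch that ignores the (very small) dropout rate; the symbols $\lambda_C$, $S_C$, $S_E$ and $m_E$ are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
\lambda_C &= \frac{\ln 2}{9} \approx 0.077 \text{ per month}, \qquad
S_C(t) = \exp(-\lambda_C t), \\
S_E(t) &=
\begin{cases}
\exp(-\lambda_C t), & t \le 3, \\
\exp\bigl(-\lambda_C \,[\,3 + 0.6\,(t - 3)\,]\bigr), & t > 3,
\end{cases} \\
S_E(m_E) = 0.5 \;&\Longrightarrow\; 3 + 0.6\,(m_E - 3) = 9 \;\Longrightarrow\; m_E = 13 \text{ months}.
\end{align*}
```

Under these assumptions, the delayed effect moves the median from 9 months on control to 13 months on the experimental arm.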
library(gsDesign2) # design and bound-update functions
library(gt) # tab_header() for table titles

alpha <- 0.0125
beta <- 0.1
ratio <- 1

# Enrollment
enroll_rate <- define_enroll_rate(
  duration = c(2, 2, 10),
  rate = (1:3) / 3
)

# Failure and dropout
fail_rate <- define_fail_rate(
  duration = c(3, Inf),
  fail_rate = log(2) / 9,
  hr = c(1, 0.6),
  dropout_rate = .0001
)

# IA and FA analysis time
analysis_time <- c(20, 36)

# Randomization ratio
ratio <- 1
We use the null hypothesis information for boundary crossing probability calculations under both the null and alternate hypotheses. This also implies that the null hypothesis information determines the information fraction used in the spending functions to derive the design.
info_scale <- "h0_info"
One-sided design
For the design, we have efficacy bounds at both the IA and FA. We use the Lan and DeMets (1983) spending function with a total alpha of 0.0125, which approximates an O’Brien-Fleming bound.
upper <- gs_spending_bound
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)

x <- gs_design_ahr(
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  alpha = alpha,
  beta = beta,
  info_frac = NULL,
  info_scale = "h0_info",
  analysis_time = analysis_time,
  ratio = ratio,
  upper = gs_spending_bound,
  upar = upar,
  test_upper = TRUE,
  # No lower bound: fixed bounds of -Inf that are never tested
  lower = gs_b,
  lpar = rep(-Inf, 2),
  test_lower = FALSE
) |> to_integer()
The planned design targets:
- Planned events: 227, 349
- Planned information fraction for interim and final analysis: 0.6504, 1
- Planned alpha spending: 0.0020, 0.0125
- Planned efficacy bounds: 2.8853, 2.2611
We note that rounding up the final targeted events increases power slightly over the targeted 90%.
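The cumulative alpha spent at the IA can be reproduced by hand from the Lan-DeMets spending function approximating O'Brien-Fleming bounds, $\alpha(t) = 2 - 2\Phi\left(z_{1-\alpha/2}/\sqrt{t}\right)$. The sketch below uses base R only; `ldof_spend` is a hypothetical helper name introduced here, intended to mirror `gsDesign::sfLDOF` with its default parameter, and the information fraction is taken as planned IA events over planned FA events.

```r
# Hypothetical helper (not part of gsDesign2): Lan-DeMets spending
# function approximating O'Brien-Fleming bounds.
ldof_spend <- function(alpha, t) {
  2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(t))
}

# Planned information fraction at the IA: planned IA events / planned FA events
t_ia <- 227 / 349

# Cumulative alpha spent at the IA for the design alpha and for the
# alternate alpha considered later in this article
spend_ia_design <- ldof_spend(alpha = 0.0125, t = t_ia)
spend_ia_alternate <- ldof_spend(alpha = 0.025, t = t_ia)

print(c(design = spend_ia_design, alternate = spend_ia_alternate))
```

The two values should be close to 0.0020 and 0.0054, the cumulative null-hypothesis crossing probabilities at the IA reported in the tables for alpha = 0.0125 and alpha = 0.025, respectively.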
x |>
  summary() |>
  as_gt() |>
  tab_header(title = "Planned design")
Planned design

| Bound | Z | Nominal p^{1} | ~HR at bound^{2} | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
|---|---|---|---|---|---|
| Analysis: 1 Time: 19.9 N: 430 Events: 227 AHR: 0.73 Information fraction: 0.65 | | | | | |
| Efficacy | 2.89 | 0.0020 | 0.6818 | 0.2943 | 0.0020 |
| Analysis: 2 Time: 35.8 N: 430 Events: 349 AHR: 0.68 Information fraction: 1 | | | | | |
| Efficacy | 2.26 | 0.0119 | 0.7850 | 0.9029 | 0.0125 |

^{1} One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
^{2} Approximate hazard ratio to cross bound.
Bounds for alternate alpha
At the design stage, we may be required to report bounds under more than one \alpha, because \alpha can be reallocated to this hypothesis when another hypothesis is rejected. The planned \alpha is 0.0125. Assume the updated \alpha is 0.025 after reallocation from another hypothesis. The corresponding bounds are
gs_update_ahr(
  x = x,
  alpha = 0.025
) |>
  summary(col_decimals = c(z = 4)) |>
  as_gt(title = "Updated design",
        subtitle = "For alternate alpha = 0.025")
Updated design
For alternate alpha = 0.025

| Bound | Z | Nominal p^{1} | ~HR at bound^{2} | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
|---|---|---|---|---|---|
| Analysis: 1 Time: 19.9 N: 430 Events: 227 AHR: 0.73 Information fraction: 0.65 | | | | | |
| Efficacy | 2.5459 | 0.0054 | 0.7132 | 0.4202 | 0.0054 |
| Analysis: 2 Time: 35.8 N: 430 Events: 349 AHR: 0.68 Information fraction: 1 | | | | | |
| Efficacy | 1.9897 | 0.0233 | 0.8081 | 0.9418 | 0.0250 |

^{1} One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
^{2} Approximate hazard ratio to cross bound.
The updated boundaries above use the planned treatment effect and the planned statistical information under the null hypothesis, because the original design has info_scale = "h0_info".
Updating bounds with observed events at time of analyses
We provide a simulation below where the observed events at the IA and FA differ from planned. In this case, the differences from planned are due to using calendar-based cutoffs for the simulated data. In practice, even when attempting to match event counts exactly, the observed events at analyses often differ from planned. We also assume the protocol specifies that the full \alpha will be spent at the final analysis, even in a case where there is a shortfall of events versus the design plan.
The observed data for this example are generated by simtrial::sim_pw_surv().
set.seed(123) # Make simulated data reproducible

# Generate trial data
observed_data <- simtrial::sim_pw_surv(
  n = x$analysis$n[x$analysis$analysis == 2],
  stratum = data.frame(stratum = "All", p = 1),
  block = c(rep("control", 2), rep("experimental", 2)),
  enroll_rate = x$enroll_rate,
  fail_rate = (fail_rate |> simtrial::to_sim_pw_surv())$fail_rate,
  dropout_rate = (fail_rate |> simtrial::to_sim_pw_surv())$dropout_rate
)

# Cut simulated data for interim analysis at planned calendar time
observed_data_ia <- observed_data |> simtrial::cut_data_by_date(analysis_time[1])

# Cut simulated data for final analysis at planned calendar time
observed_data_fa <- observed_data |> simtrial::cut_data_by_date(analysis_time[2])
The updated design is
# Set spending fraction for interim according to observed events
# divided by planned final events.
# Final spending fraction is 1 per plan even if there is a shortfall
# of events versus planned (as specified above)
ustime <- c(sum(observed_data_ia$event) / max(x$analysis$event), 1)

# Update bound
gs_update_ahr(
  x = x,
  ustime = ustime,
  observed_data = list(observed_data_ia, observed_data_fa)
) |>
  summary(col_decimals = c(z = 4)) |>
  as_gt(title = "Updated design",
        subtitle = paste0("With observed ", sum(observed_data_ia$event),
                          " events at IA and ", sum(observed_data_fa$event),
                          " events at FA"))
Updated design
With observed 241 events at IA and 353 events at FA

| Bound | Z | Nominal p^{1} | ~HR at bound^{2} | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
|---|---|---|---|---|---|
| Analysis: 1 Time: 19.9 N: 430 Events: 241 AHR: 0.73 Information fraction: 0.68 | | | | | |
| Efficacy | 2.7882 | 0.0026 | 0.6982 | 0.3558 | 0.0026 |
| Analysis: 2 Time: 35.8 N: 430 Events: 353 AHR: 0.69 Information fraction: 1 | | | | | |
| Efficacy | 2.2688 | 0.0116 | 0.7854 | 0.8949 | 0.0125 |

^{1} One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
^{2} Approximate hazard ratio to cross bound.
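The arithmetic behind this update can be sketched in base R. The spending time at the IA uses planned final events in the denominator, whereas the information fraction shown in the table appears to use observed final events. The event counts are copied from the output above; `ldof_spend` is a hypothetical helper name introduced here, assumed to match `gsDesign::sfLDOF` with its default parameter.

```r
# Event counts copied from the design and the simulated data above
events_ia <- 241     # observed events at IA
events_fa <- 353     # observed events at FA
planned_final <- 349 # planned events at FA in the one-sided design

# Spending time: observed IA events / planned final events; 1 at FA per protocol
spending_time <- c(events_ia / planned_final, 1)

# Information fraction: observed IA events / observed FA events
info_frac <- c(events_ia / events_fa, 1)

# Hypothetical helper: Lan-DeMets spending approximating O'Brien-Fleming
ldof_spend <- function(alpha, t) {
  2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(t))
}

# Cumulative alpha spent at the IA under the design alpha of 0.0125
spend_ia <- ldof_spend(alpha = 0.0125, t = spending_time[1])

print(round(spending_time, 3))
print(round(info_frac, 3))
print(spend_ia)
```

Under these assumptions the IA spend is about 0.0026, in line with the nominal p-value of the updated IA efficacy bound, and the information fraction of about 0.68 agrees with the table header.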
Two-sided asymmetric design, beta-spending with non-binding lower bound
In this section, we investigate a 2-sided asymmetric design, with non-binding \beta-spending used to generate futility bounds. \beta-spending refers to Type II error (1 - power) spending for the lower bound crossing probabilities under the alternative hypothesis. Non-binding bound computation assumes the trial continues if the lower bound is crossed when computing Type I error, but not when computing Type II error.
In the original design, we employ the Lan-DeMets spending function used to approximate O’Brien-Fleming bounds (Lan and DeMets 1983) for both efficacy and futility bounds. The total spending for efficacy is 0.0125, and for futility is 0.1. In addition, we assume there is no futility test at the final analysis.
# Upper and lower bounds use Lan-DeMets spending approximating
# O'Brien-Fleming bound
upper <- gs_spending_bound
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
lower <- gs_spending_bound
lpar <- list(sf = gsDesign::sfLDOF, total_spend = beta, param = NULL)

x <- gs_design_ahr(
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  alpha = alpha,
  beta = beta,
  info_frac = NULL,
  info_scale = "h0_info",
  analysis_time = c(20, 36),
  ratio = ratio,
  upper = gs_spending_bound,
  upar = upar,
  test_upper = TRUE,
  lower = lower,
  lpar = lpar,
  test_lower = c(TRUE, FALSE),
  binding = FALSE
) |> to_integer()
In the planned design, we have
- Planned events: 236, 363
- Planned information fraction (timing): 0.6501, 1
- Planned alpha spending: 0.0020, 0.0125
- Planned efficacy bounds: 2.8861, 2.2611
- Planned futility bounds: 0.659
Since we added futility bounds, the sample size and number of events are larger than in the one-sided example above.
x |>
  summary() |>
  as_gt() |>
  tab_header(title = "Planned design",
             subtitle = "2-sided asymmetric design, non-binding futility")
Planned design
2-sided asymmetric design, non-binding futility

| Bound | Z | Nominal p^{1} | ~HR at bound^{2} | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
|---|---|---|---|---|---|
| Analysis: 1 Time: 19.9 N: 446 Events: 236 AHR: 0.73 Information fraction: 0.65 | | | | | |
| Futility | 0.66 | 0.2549 | 0.9178 | 0.0414 | 0.7451 |
| Efficacy | 2.89 | 0.0020 | 0.6868 | 0.3114 | 0.0020 |
| Analysis: 2 Time: 36 N: 446 Events: 363 AHR: 0.68 Information fraction: 1 | | | | | |
| Efficacy | 2.26 | 0.0119 | 0.7887 | 0.9027 | ^{3} 0.0124 |

^{1} One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
^{2} Approximate hazard ratio to cross bound.
^{3} Cumulative alpha for final analysis (0.0124) is less than the full alpha (0.0125) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.0125 - 0.0001 = 0.0124) under the null hypothesis.
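Footnote 3 can be made explicit by writing out the crossing probabilities for this two-analysis design. The notation is introduced here for illustration and is not from the original text: $Z_1, Z_2$ are the test statistics at the IA and FA, $a_1$ is the IA futility bound, $b_1, b_2$ are the efficacy bounds, $\beta(t_1)$ is the cumulative futility spend at the IA, and $P_0$, $P_1$ denote probabilities under the null and alternate hypotheses. The numerical values are read from the table above.

```latex
\begin{align*}
\text{Type I error (non-binding):}\quad
& P_0(Z_1 \ge b_1) + P_0(Z_1 < b_1,\ Z_2 \ge b_2) = \alpha = 0.0125 \\
\text{Reported cumulative probability:}\quad
& P_0(Z_1 \ge b_1) + P_0(a_1 < Z_1 < b_1,\ Z_2 \ge b_2) \approx 0.0124 \\
\text{Difference:}\quad
& P_0(Z_1 \le a_1,\ Z_2 \ge b_2) \approx 0.0001 \\
\text{Futility spending at the IA:}\quad
& P_1(Z_1 \le a_1) = \beta(t_1) \approx 0.0414
\end{align*}
```

The efficacy bounds are calibrated with the first line, which ignores the futility bound, so the trial still controls Type I error at 0.0125 if it continues after crossing $a_1$.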
Bounds for alternate alpha
We may want to report the design bounds under multiple \alpha in case Type I error is reallocated from another hypothesis. We now assume that \alpha is 0.025, but we still use the same sample size and event timing as for the original alpha = 0.0125. The updated bounds are
gs_update_ahr(
  x = x,
  alpha = 0.025
) |>
  summary(col_decimals = c(z = 4)) |>
  as_gt(title = "Updated design",
        subtitle = "For alpha = 0.025")
Updated design
For alpha = 0.025

| Bound | Z | Nominal p^{1} | ~HR at bound^{2} | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
|---|---|---|---|---|---|
| Analysis: 1 Time: 19.9 N: 446 Events: 236 AHR: 0.73 Information fraction: 0.65 | | | | | |
| Futility | 0.6590 | 0.2549 | 0.9178 | 0.0414 | 0.7451 |
| Efficacy | 2.5466 | 0.0054 | 0.7178 | 0.4394 | 0.0054 |
| Analysis: 2 Time: 36 N: 446 Events: 363 AHR: 0.68 Information fraction: 1 | | | | | |
| Efficacy | 1.9897 | 0.0233 | 0.8115 | 0.9310 | ^{3} 0.0244 |

^{1} One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
^{2} Approximate hazard ratio to cross bound.
^{3} Cumulative alpha for final analysis (0.0244) is less than the full alpha (0.025) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.025 - 0.0006 = 0.0244) under the null hypothesis.
Updating bounds with observed events at time of analyses
We assume the same observed events as in the one-sided example above.
The updated design is
# Update spending fraction as above
ustime <- c(sum(observed_data_ia$event) / max(x$analysis$event), 1)

gs_update_ahr(
  x = x,
  ustime = ustime,
  # Spending fraction for futility bound same as for efficacy
  lstime = ustime,
  observed_data = list(observed_data_ia, observed_data_fa)
) |>
  summary(col_decimals = c(z = 4)) |>
  as_gt(title = "Updated design",
        subtitle = paste0("With observed ", sum(observed_data_ia$event),
                          " events at IA and ", sum(observed_data_fa$event),
                          " events at FA"))
Updated design
With observed 241 events at IA and 353 events at FA

| Bound | Z | Nominal p^{1} | ~HR at bound^{2} | Cumulative boundary crossing probability: alternate hypothesis | Cumulative boundary crossing probability: null hypothesis |
|---|---|---|---|---|---|
| Analysis: 1 Time: 19.9 N: 446 Events: 241 AHR: 0.73 Information fraction: 0.68 | | | | | |
| Futility | 0.7073 | 0.2397 | 0.9129 | 0.0435 | 0.7603 |
| Efficacy | 2.8518 | 0.0022 | 0.6925 | 0.3324 | 0.0022 |
| Analysis: 2 Time: 36 N: 446 Events: 353 AHR: 0.69 Information fraction: 1 | | | | | |
| Efficacy | 2.2614 | 0.0119 | 0.7861 | 0.8866 | ^{3} 0.0124 |

^{1} One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
^{2} Approximate hazard ratio to cross bound.
^{3} Cumulative alpha for final analysis (0.0124) is less than the full alpha (0.0125) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.0125 - 0.0001 = 0.0124) under the null hypothesis.