
Design assumptions

We assume two analyses: an interim analysis (IA) and a final analysis (FA). The IA is planned 20 months after enrollment opens, and the FA at month 36. The planned enrollment period spans 14 months: the enrollment rate is 1/3 of the final rate for the first 2 months, 2/3 of the final rate for the next 2 months, and the final rate for the remaining 10 months. To obtain the targeted 90% power, these rates will be multiplied by a constant. The control arm is assumed to follow an exponential distribution with a median of 9 months, and the dropout rate is 0.0001 per month in both treatment groups. Finally, the experimental arm is piecewise exponential with a 3-month delayed treatment effect; that is, the hazard ratio (HR) is 1 for the first 3 months and 0.6 thereafter.
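
As a quick check of these failure-rate assumptions, the sketch below uses base R only to convert the 9-month control median into a monthly hazard and to find the median implied for the experimental arm under the delayed effect. The helper `cum_hazard_experimental()` is a hypothetical name introduced for illustration and is not part of any package.

```r
# Control hazard implied by an exponential distribution with a 9-month median
lambda_control <- log(2) / 9

# Cumulative hazard in the experimental arm: HR = 1 during the first
# 3 months, HR = 0.6 afterwards (hypothetical helper, for illustration)
cum_hazard_experimental <- function(t) {
  lambda_control * (pmin(t, 3) + 0.6 * pmax(t - 3, 0))
}

# The median is the time at which the cumulative hazard reaches log(2)
median_experimental <- uniroot(
  function(t) cum_hazard_experimental(t) - log(2),
  interval = c(0, 100),
  tol = 1e-8
)$root

lambda_control       # monthly control hazard, roughly 0.077
median_experimental  # roughly 13 months
```

Under these assumptions the delayed effect moves the median from 9 months in the control arm to about 13 months in the experimental arm.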

library(gsDesign2)
library(gt)

# One-sided Type I error and Type II error (1 - power)
alpha <- 0.025
beta <- 0.1

# Enrollment
enroll_rate <- define_enroll_rate(
  duration = c(2, 2, 10),
  rate = (1:3) / 3
)

# Failure and dropout
fail_rate <- define_fail_rate(
  duration = c(3, Inf),
  fail_rate = log(2) / 9,
  hr = c(1, 0.6),
  dropout_rate = .0001
)

# IA and FA analysis times (months)
analysis_time <- c(20, 36)

# Randomization ratio
ratio <- 1
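
The relative enrollment rates above only fix the shape of the enrollment curve. The following sketch illustrates the scaling step in base R, assuming a target sample size of 366 (the N reported in the one-sided design summary later in this document). The variable names are illustrative, and the design functions may handle rounding differently.

```r
# Relative enrollment pattern from the design assumptions
duration <- c(2, 2, 10)
relative_rate <- (1:3) / 3

# Patients enrolled over 14 months at the relative rates
relative_total <- sum(duration * relative_rate)

# Assumed target sample size, taken from the one-sided design summary
n_target <- 366

# Constant that scales the relative rates to reach the target
multiplier <- n_target / relative_total
scaled_rate <- relative_rate * multiplier

multiplier                   # patients per month at the final rate
sum(duration * scaled_rate)  # recovers the target sample size
```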

We use the null hypothesis information to compute boundary crossing probabilities under both the null and alternate hypotheses. This also means that the information fraction passed to the spending functions when deriving the design is based on the null hypothesis information.

info_scale <- "h0_info"

One-sided design

For this design, we place efficacy bounds at both the IA and FA. We use the Lan-DeMets spending function that approximates an O’Brien-Fleming bound (Lan and DeMets 1983), with a total alpha of 0.025.

upper <- gs_spending_bound
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)

x <- gs_design_ahr(
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  alpha = alpha,
  beta = beta,
  info_frac = NULL,
  info_scale = "h0_info",
  analysis_time = analysis_time,
  ratio = ratio,
  upper = gs_spending_bound,
  upar = upar,
  test_upper = TRUE,
  lower = gs_b,
  lpar = rep(-Inf, 2),
  test_lower = FALSE
) |> to_integer()

The planned design targets:

  • Planned events: 193, 297
  • Planned information fraction for interim and final analysis: 0.6498, 1
  • Planned alpha spending: 0.0054, 0.025
  • Planned efficacy bounds: 2.5473, 1.9896

We note that rounding up the final targeted events increases power slightly over the targeted 90%.
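
The planned interim alpha spending and efficacy bound listed above can be cross-checked by hand. The sketch below writes out the Lan-DeMets O’Brien-Fleming-type spending function in base R and evaluates it at the planned information fraction, taken here as the ratio of planned events. `ldof_spend()` is a hypothetical helper written for this check; it is intended to mirror `gsDesign::sfLDOF` but is not the package code.

```r
# Lan-DeMets spending function approximating O'Brien-Fleming bounds:
# alpha(t) = 2 * (1 - pnorm(qnorm(1 - alpha / 2) / sqrt(t)))
ldof_spend <- function(t, total_alpha) {
  2 * (1 - pnorm(qnorm(1 - total_alpha / 2) / sqrt(t)))
}

planned_events <- c(193, 297)
info_frac <- planned_events / planned_events[2]

# Cumulative alpha spent at the IA
ia_spend <- ldof_spend(info_frac[1], total_alpha = 0.025)

# At the first analysis the Z bound is the normal quantile of the spend
ia_bound <- qnorm(1 - ia_spend)

round(info_frac, 4)
round(ia_spend, 4)
round(ia_bound, 4)
```

The final bound cannot be obtained this way, because it depends on the joint distribution of the interim and final test statistics.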

x |>
  summary() |>
  as_gt() |>
  tab_header(title = "Original design") |>
  tab_style(
    style = list(
      cell_fill(color = "lightcyan"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(
      columns = Z,
      rows = Bound == "Efficacy"
    )
  )
Original design

Bound     Z     Nominal p [1]  ~HR at bound [2]  Cumulative boundary crossing probability
                                                 Alternate hypothesis  Null hypothesis
Analysis: 1  Time: 19.9  N: 366  Event: 193  AHR: 0.73  Information fraction: 0.65
Efficacy  2.55  0.0054         0.6930            0.3494                0.0054
Analysis: 2  Time: 35.8  N: 366  Event: 297  AHR: 0.68  Information fraction: 1
Efficacy  1.99  0.0233         0.7938            0.9023                0.0250

[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
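
The nominal p-value and approximate HR columns in the table above can be related to the Z bounds with two short formulas. The sketch below assumes the approximate HR follows the usual Schoenfeld-type approximation, in which the variance of the log hazard ratio is about (1 + r)^2 / (r × events) for randomization ratio r. That is an assumption about how the column is derived, although it agrees with the tabulated values.

```r
# Efficacy bounds and planned events from the one-sided design
z_bound <- c(2.5473, 1.9896)
events <- c(193, 297)
rand_ratio <- 1  # experimental:control randomization ratio

# One-sided nominal p-value at each bound
nominal_p <- pnorm(-z_bound)

# Approximate HR needed to cross each bound (Schoenfeld-type approximation)
hr_at_bound <- exp(-z_bound * (1 + rand_ratio) / sqrt(rand_ratio * events))

round(nominal_p, 4)
round(hr_at_bound, 4)
```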

We provide a simulation below in which 188 and 295 events are observed at the IA and FA, respectively. We assume the differences from the planned events (193, 297) are due to logistical considerations. We also assume the protocol specifies that the full \(\alpha\) will be spent at the final analysis, even in a case like this where there is a shortfall of events relative to the design plan.

The observed data for this example is generated by simtrial::sim_pw_surv().

set.seed(123)

observed_data <- simtrial::sim_pw_surv(
  n = x$analysis$n[x$analysis$analysis == 2],
  stratum = data.frame(stratum = "All", p = 1),
  block = c(rep("control", 2), rep("experimental", 2)),
  enroll_rate = x$enroll_rate,
  fail_rate = (fail_rate |> simtrial::to_sim_pw_surv())$fail_rate,
  dropout_rate = (fail_rate |> simtrial::to_sim_pw_surv())$dropout_rate
)

observed_data_ia <- observed_data |> simtrial::cut_data_by_date(analysis_time[1])
observed_data_fa <- observed_data |> simtrial::cut_data_by_date(analysis_time[2])

The updated design is

gs_update_ahr(
  x = x,
  ia_alpha_spending = "actual_info_frac",
  fa_alpha_spending = "full_alpha",
  observed_data = list(observed_data_ia, observed_data_fa)
) |>
  summary(
    col_vars = c(
      "analysis", "bound", "z", "~hr at bound",
      "nominal p", "Alternate hypothesis", "Null hypothesis"
    ),
    col_decimals = c(NA, NA, 4, 4, 4, 4, 4)
  ) |>
  as_gt() |>
  tab_style(
    style = list(
      cell_fill(color = "lightcyan"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(
      columns = Z,
      rows = Bound == "Efficacy"
    )
  )
## Joining with `by = join_by(analysis, bound, z)`
Bound summary for AHR design
AHR approximations of ~HR at bound

Bound     Z       Nominal p [1]  ~HR at bound [2]  Cumulative boundary crossing probability
                                                   Alternate hypothesis  Null hypothesis
Analysis: 1  Time: 19.9  N: 366  Event: 188  AHR: 0.74  Information fraction: 0.63
Efficacy  2.5868  0.0048         0.6857            0.3150                0.0048
Analysis: 2  Time: 35.8  N: 366  Event: 295  AHR: 0.68  Information fraction: 1
Efficacy  1.9860  0.0235         0.7935            0.9009                0.0250

[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
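
To see where the updated interim bound comes from, the sketch below recomputes the interim spending at the observed information fraction, taken here as observed IA events divided by planned final events. This is a base R approximation of what `ia_alpha_spending = "actual_info_frac"` implies under the null hypothesis information scale. `ldof_spend()` is a hypothetical helper, not the package implementation.

```r
# Lan-DeMets O'Brien-Fleming-type spending function (illustrative helper)
ldof_spend <- function(t, total_alpha) {
  2 * (1 - pnorm(qnorm(1 - total_alpha / 2) / sqrt(t)))
}

planned_final_events <- 297
observed_ia_events <- 188

# Observed information fraction at the IA
observed_info_frac <- observed_ia_events / planned_final_events

# Alpha spent and Z bound at the IA, based on the observed fraction
updated_ia_spend <- ldof_spend(observed_info_frac, total_alpha = 0.025)
updated_ia_bound <- qnorm(1 - updated_ia_spend)

round(observed_info_frac, 2)
round(updated_ia_spend, 4)
round(updated_ia_bound, 4)
```

Fewer events than planned at the IA means less alpha is spent there, so the interim bound is slightly higher than in the original design. The final bound under `"full_alpha"` is not reproduced here because it requires the joint distribution of the two test statistics.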

Two-sided asymmetric design, beta-spending with non-binding lower bound

In this section, we investigate a two-sided asymmetric design with non-binding beta-spending futility bounds. Beta-spending refers to error spending for the lower bound crossing probabilities under the alternative hypothesis. Non-binding means that the Type I error computation assumes the trial continues even if the futility bound is crossed, whereas the Type II error computation assumes the trial stops when the futility bound is crossed.

In the original design, we use the Lan-DeMets spending function that approximates O’Brien-Fleming bounds (Lan and DeMets 1983) for both the efficacy and futility bounds. The total spending is 0.025 for efficacy and 0.1 for futility. In addition, we assume futility is tested only at the IA.

upper <- gs_spending_bound
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
lower <- gs_spending_bound
lpar <- list(sf = gsDesign::sfLDOF, total_spend = beta, param = NULL)

x <- gs_design_ahr(
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  alpha = alpha,
  beta = beta,
  info_frac = NULL,
  info_scale = "h0_info",
  analysis_time = c(20, 36),
  ratio = ratio,
  upper = gs_spending_bound,
  upar = upar,
  test_upper = TRUE,
  lower = lower,
  lpar = lpar,
  test_lower = c(TRUE, FALSE),
  binding = FALSE
) |> to_integer()

The planned design targets:

  • Planned events: 202, 311
  • Planned information fraction (timing): 0.6495, 1
  • Planned alpha spending: 0.0054167, 0.025
  • Planned efficacy bounds: 2.548, 1.9895
  • Planned futility bound (IA only): 0.4778

Because we added a futility bound, the sample size and number of events are larger than in the one-sided example.

x |>
  summary() |>
  as_gt() |>
  tab_header(title = "Original design") |>
  tab_style(
    style = list(
      cell_fill(color = "lightcyan"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(
      columns = Z,
      rows = Bound == "Efficacy"
    )
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#F9E3D6"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(
      columns = Z,
      rows = Bound == "Futility"
    )
  )
Original design

Bound     Z     Nominal p [1]  ~HR at bound [2]  Cumulative boundary crossing probability
                                                 Alternate hypothesis  Null hypothesis
Analysis: 1  Time: 19.9  N: 382  Event: 202  AHR: 0.73  Information fraction: 0.65
Futility  0.48  0.3164         0.9350            0.0413                0.6836
Efficacy  2.55  0.0054         0.6987            0.3692                0.0054
Analysis: 2  Time: 36.1  N: 382  Event: 311  AHR: 0.68  Information fraction: 1
Efficacy  1.99  0.0233         0.7980            0.9030                0.0247 [3]

[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
[3] Cumulative alpha for final analysis (0.0247) is less than the full alpha (0.025) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.025 - 0.0003 = 0.0247) under the null hypothesis.
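
The interim futility row can be read off the standard normal distribution. The sketch below assumes the interim Z statistic is standard normal under the null hypothesis and reuses a Schoenfeld-type approximation for the HR at the bound with 1:1 randomization. Both are simplifying assumptions made for this check.

```r
# Planned interim futility bound and planned interim events
z_futility <- 0.4778
events_ia <- 202

# One-sided nominal p-value at the futility bound
nominal_p_futility <- pnorm(-z_futility)

# Probability of falling below the futility bound at the IA under the null
null_crossing_prob <- pnorm(z_futility)

# Approximate observed HR at the futility bound (1:1 randomization)
hr_at_futility <- exp(-2 * z_futility / sqrt(events_ia))

round(nominal_p_futility, 4)
round(null_crossing_prob, 4)
round(hr_at_futility, 4)
```

In other words, under these assumptions an observed HR above roughly 0.935 at the IA would cross the futility bound, which happens about 68% of the time when there is no treatment effect.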

For simplicity of presentation, we assume the observed events are the same as in the one-sided design.

The updated design is

gs_update_ahr(
  x = x,
  ia_alpha_spending = "actual_info_frac",
  fa_alpha_spending = "full_alpha",
  observed_data = list(observed_data_ia, observed_data_fa)
) |>
  summary(
    col_vars = c(
      "analysis", "bound", "z", "~hr at bound",
      "nominal p", "Alternate hypothesis", "Null hypothesis"
    ),
    col_decimals = c(NA, NA, 4, 4, 4, 4, 4)
  ) |>
  as_gt() |>
  tab_style(
    style = list(
      cell_fill(color = "lightcyan"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(
      columns = Z,
      rows = Bound == "Efficacy"
    )
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#F9E3D6"),
      cell_text(weight = "bold")
    ),
    locations = cells_body(
      columns = Z,
      rows = Bound == "Futility"
    )
  )
## Joining with `by = join_by(analysis, bound, z)`
Bound summary for AHR design
AHR approximations of ~HR at bound

Bound     Z       Nominal p [1]  ~HR at bound [2]  Cumulative boundary crossing probability
                                                   Alternate hypothesis  Null hypothesis
Analysis: 1  Time: 19.9  N: 382  Event: 188  AHR: 0.74  Information fraction: 0.6
Futility  0.2850  0.3878         0.9593            0.0344                0.6122
Efficacy  2.6571  0.0039         0.6787            0.2904                0.0039
Analysis: 2  Time: 36.1  N: 382  Event: 295  AHR: 0.68  Information fraction: 1
Efficacy  1.9788  0.0239         0.7942            0.8939                0.0248 [3]

[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
[3] Cumulative alpha for final analysis (0.0248) is less than the full alpha (0.025) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.025 - 0.0002 = 0.0248) under the null hypothesis.

References

Lan, K. K. Gordon, and David L. DeMets. 1983. “Discrete Sequential Boundaries for Clinical Trials.” Biometrika 70 (3): 659–63.