Process survival data into counting process format: counting_process

Produces a data frame, sorted by stratum and time, that includes only the times at which one or more events occur. For each such time, the output contains the stratum, the time-to-event (TTE), the number of subjects at risk, and the number of events.

Usage

counting_process(x, arm)

Arguments

x

A data frame with no missing values that contains the variables:

stratum: Stratum.
treatment: Treatment group.
tte: Observed time.
event: Binary event indicator, 1 represents event, 0 represents censoring.

arm

Value in the treatment column of x that identifies the treatment group.

Value

A data frame grouped by stratum and sorted within stratum by tte. It includes only rows for times with at least one event in the combined population and at least one subject at risk in both the treatment group and the control group. The remaining variables give the following quantities within each stratum at each such event time:

event_total: Total number of events
event_trt: Number of events in the treatment group
n_risk_total: Number of subjects at risk
n_risk_trt: Number of subjects at risk in treatment group
s: Left-continuous Kaplan-Meier survival estimate
o_minus_e: Observed minus expected number of events in the treatment group. The expected number is computed assuming no treatment effect, so that the number of treatment-group events at each time follows a hypergeometric distribution with parameters n_risk_total, n_risk_trt, and event_total (the same assumption the log-rank test makes under the null hypothesis).
var_o_minus_e: Variance of o_minus_e under the same assumption.
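
As a rough illustration of how s, o_minus_e, and var_o_minus_e relate to the count columns, the sketch below recomputes them in base R from the counts printed in Example 1. It uses the standard hypergeometric mean and variance and a lagged product-limit estimate; it is a hand calculation under those assumptions, not the package's internal code. The helper km_left() and the data frame cp are hypothetical names introduced only for this sketch.

```r
# Counts copied from the Example 1 output (one row per retained event time).
cp <- data.frame(
  stratum      = c(1, 1, 1, 1, 2, 2),
  event_total  = c(1, 1, 1, 1, 1, 1),
  event_trt    = c(1, 0, 1, 0, 0, 1),
  n_risk_total = c(9, 7, 5, 3, 5, 3),
  n_risk_trt   = c(5, 4, 3, 2, 2, 1)
)

# Under the null, each event is equally likely to come from any subject at
# risk, so the expected treatment-arm count is event_total * n_risk_trt / n_risk_total.
p_trt <- cp$n_risk_trt / cp$n_risk_total
o_minus_e <- cp$event_trt - cp$event_total * p_trt

# Hypergeometric variance of the treatment-arm event count.
# (Undefined when n_risk_total is 1; that case does not occur in these rows.)
var_o_minus_e <- cp$event_total * p_trt * (1 - p_trt) *
  (cp$n_risk_total - cp$event_total) / (cp$n_risk_total - 1)

# Hypothetical helper: left-continuous Kaplan-Meier, i.e. the survival
# estimate just before each event time (1 at the first event time).
km_left <- function(d, n) c(1, head(cumprod(1 - d / n), -1))

# Apply within each stratum; rows are already sorted by stratum and time.
s <- unlist(
  lapply(split(cp, cp$stratum), function(g) km_left(g$event_total, g$n_risk_total)),
  use.names = FALSE
)

round(cbind(s, o_minus_e, var_o_minus_e), 7)
```

For these six rows the recomputed values agree with the s, o_minus_e, and var_o_minus_e columns shown in Example 1 up to rounding.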

Details

The function handles only the two-group case.

Ties are handled by Breslow's method.

counting_process() returns a counting process dataset grouped by stratum and sorted within stratum by increasing event time. The object is assigned the class "counting_process". It also has the attribute "ratio", which is the ratio of the events in the treatment arm to those in the control arm in the input time-to-event data. If the input data was generated by sim_pw_surv(), the ratio attribute is taken from the attribute of the same name on the input object. Otherwise, the returned ratio is the empirical ratio of treatment to control events.

Examples

# Example 1
x <- data.frame(
  stratum = c(rep(1, 10), rep(2, 6)),
  treatment = rep(c(1, 1, 0, 0), 4),
  tte = 1:16,
  event = rep(c(0, 1), 8)
)
counting_process(x, arm = 1)
#>   stratum event_total event_trt tte n_risk_total n_risk_trt         s
#> 1       1           1         1   2            9          5 1.0000000
#> 2       1           1         0   4            7          4 0.8888889
#> 3       1           1         1   6            5          3 0.7619048
#> 4       1           1         0   8            3          2 0.6095238
#> 5       2           1         0  12            5          2 1.0000000
#> 6       2           1         1  14            3          1 0.8000000
#>    o_minus_e var_o_minus_e
#> 1  0.4444444     0.2469136
#> 2 -0.5714286     0.2448980
#> 3  0.4000000     0.2400000
#> 4 -0.6666667     0.2222222
#> 5 -0.4000000     0.2400000
#> 6  0.6666667     0.2222222

# Example 2
x <- sim_pw_surv(n = 400)
y <- cut_data_by_event(x, 150) |> counting_process(arm = "experimental")
# Logrank test with unit weights (Z-value and 1-sided p-value)
z <- sum(y$o_minus_e) / sqrt(sum(y$var_o_minus_e))
c(z, pnorm(z))
#> [1] -3.5701616116  0.0001783805
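
The statistic in Example 2 gives every event time the same weight. As a sketch of how the s column could be used to weight event times differently, the block below applies Fleming-Harrington FH(rho, gamma) weights, w = s^rho * (1 - s)^gamma, to the columns printed in Example 1. The helpers fh_weight() and wlr_z() are hypothetical names introduced here for illustration; this is a hand calculation on the printed values and may not match how the package implements weighted tests.

```r
# Columns copied from the Example 1 output.
s <- c(1.0000000, 0.8888889, 0.7619048, 0.6095238, 1.0000000, 0.8000000)
o_minus_e <- c(0.4444444, -0.5714286, 0.4000000, -0.6666667, -0.4000000, 0.6666667)
var_o_minus_e <- c(0.2469136, 0.2448980, 0.2400000, 0.2222222, 0.2400000, 0.2222222)

# Hypothetical helper: Fleming-Harrington weights based on the
# left-continuous Kaplan-Meier estimate s.
fh_weight <- function(s, rho = 0, gamma = 0) s^rho * (1 - s)^gamma

# Hypothetical helper: weighted logrank Z statistic.
# The weights enter the numerator linearly and the variance quadratically.
wlr_z <- function(w) sum(w * o_minus_e) / sqrt(sum(w^2 * var_o_minus_e))

z_logrank <- wlr_z(fh_weight(s))             # rho = gamma = 0: unit weights
z_fh01 <- wlr_z(fh_weight(s, gamma = 1))     # FH(0, 1): emphasizes later event times

c(logrank = z_logrank, fh01 = z_fh01)
```

With rho = gamma = 0 all weights equal 1, so z_logrank is the same quantity as sum(o_minus_e) / sqrt(sum(var_o_minus_e)) computed on these columns.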