
Overview

A binary outcome is a commonly used endpoint in clinical trials. This page illustrates how to conduct an unstratified or stratified risk difference analysis in R with the Miettinen and Nurminen (M&N) method (Miettinen and Nurminen 1985). The function rate_compare() calculates the following statistics:

  • Estimated risk difference.
  • Test statistic.
  • Confidence interval for the risk difference.
  • p-value for risk difference.

Statistical methods

Unstratified analysis of M&N method

Assume the data consist of two independent binomial samples with a binary response variable, collected in a clinical design without stratification. The approach also applies when the data are collected under a stratified design and the statistician chooses to ignore stratification by pooling the data over strata, again assuming two independent binomial samples. Let $P_i$ be the proportion of successes in the test ($i=1$) or control ($i=0$) group.

Confidence interval

The confidence interval is based on the M&N method and given by the roots for $PD=P_1-P_0$ of the equation:

$$\chi_\alpha^2 = \frac{(\hat{p}_1-\hat{p}_0-PD)^2}{\tilde{V}},$$

where $\hat{p}_1$ and $\hat{p}_0$ are the observed values of $P_1$ and $P_0$, respectively;

  • $\chi_\alpha^2$ = the upper cut point of size $\alpha$ from the central chi-square distribution with 1 degree of freedom ($\chi_\alpha^2 = 3.84$ for a 95% confidence interval);

  • $PD$ = the difference between the two population proportions ($PD=P_1-P_0$);

$$\tilde{V}=\bigg[\frac{\tilde{p}_1(1-\tilde{p}_1)}{n_1}+ \frac{\tilde{p}_0(1-\tilde{p}_0)}{n_0}\bigg]\frac{n_1+n_0}{n_1+n_0-1};$$

  • $n_1$ and $n_0$ are the sample sizes for the test and control group, respectively;

  • $\tilde{p}_1$ = maximum likelihood estimate of the proportion on test, computed as $\tilde{p}_0+PD$;

  • $\tilde{p}_0$ = maximum likelihood estimate of the proportion on control under the constraint $\tilde{p}_1-\tilde{p}_0=PD$.

As stated above, the two-sided $100(1-\alpha)$% CI is given by the roots for $PD=P_1-P_0$. The function uses the bisection algorithm to obtain the two roots (the confidence limits) for $PD$.
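
The root-finding step can be sketched in base R. This is a minimal illustration under stated assumptions, not the code inside rate_compare(): the helper names (`mle_p0`, `mn_chisq`, `bisect`, `mn_ci`) are hypothetical, the constrained maximum likelihood estimate is found numerically with `optimize()` rather than in closed form, and the counts (60/100 responders on test, 20/100 on control) are those of the simulated dataset in the Example section.

```r
# Constrained MLE of the control proportion under p1 - p0 = delta.
# Found by numerically maximizing the binomial log-likelihood
# (an assumption of this sketch, standing in for a closed-form solution).
mle_p0 <- function(x1, n1, x0, n0, delta) {
  loglik <- function(p0) {
    p1 <- min(max(p0 + delta, 0), 1) # guard against rounding outside [0, 1]
    dbinom(x1, n1, p1, log = TRUE) + dbinom(x0, n0, p0, log = TRUE)
  }
  interval <- c(max(0, -delta), min(1, 1 - delta))
  optimize(loglik, interval = interval, maximum = TRUE, tol = 1e-10)$maximum
}

# Chi-square statistic (p1_hat - p0_hat - delta)^2 / V_tilde.
mn_chisq <- function(x1, n1, x0, n0, delta) {
  p0 <- mle_p0(x1, n1, x0, n0, delta)
  p1 <- p0 + delta
  v <- (p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0) * (n1 + n0) / (n1 + n0 - 1)
  (x1 / n1 - x0 / n0 - delta)^2 / v
}

# Plain bisection; assumes f changes sign between a and b.
bisect <- function(f, a, b, eps = 1e-8, max_iter = 200) {
  fa <- f(a)
  for (i in seq_len(max_iter)) {
    m <- (a + b) / 2
    fm <- f(m)
    if (sign(fm) == sign(fa)) {
      a <- m
      fa <- fm
    } else {
      b <- m
    }
    if (b - a < eps) break
  }
  (a + b) / 2
}

# One root lies below the observed difference, the other above it.
mn_ci <- function(x1, n1, x0, n0, alpha = 0.05) {
  est <- x1 / n1 - x0 / n0
  f <- function(d) mn_chisq(x1, n1, x0, n0, d) - qchisq(1 - alpha, df = 1)
  c(lower = bisect(f, -1 + 1e-6, est), upper = bisect(f, est, 1 - 1e-6))
}

ci <- mn_ci(x1 = 60, n1 = 100, x0 = 20, n0 = 100)
print(ci)
```

For these counts the two roots should land close to the interval reported for the unstratified call in the Example section (roughly 0.270 to 0.517). The sketch does not handle an observed difference of exactly -1 or 1.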

p-value and Z-statistic

The Z-statistic is computed as:

$$Z_\text{diff}=\frac{\hat{p}_1-\hat{p}_0-S_0}{\sqrt{\tilde{V}}},$$

where $\hat{p}_1$ and $\hat{p}_0$ are the observed values of $P_1$ and $P_0$, respectively, and $S_0$ is the pre-specified proportion difference under the null hypothesis;

  • $\tilde{p}_1$ = maximum likelihood estimate of the proportion on test, computed as $\tilde{p}_0+S_0$;

  • $\tilde{p}_0$ = maximum likelihood estimate of the proportion on control under the constraint $\tilde{p}_1-\tilde{p}_0=S_0$.

  • For a non-inferiority or one-sided equivalence hypothesis with $S_0>0$, the p-value, $\Pr(Z \geq Z_\text{diff} \, | \, H_0)$, is computed based on $Z_\text{diff}$ using the standard normal distribution.

  • For a non-inferiority or one-sided equivalence hypothesis with $S_0<0$, the p-value, $\Pr(Z \leq Z_\text{diff} \, | \, H_0)$, is computed based on $Z_\text{diff}$ using the standard normal distribution.

  • For a two-sided superiority test, the p-value, $\Pr(\chi_\text{diff}^2 \leq \chi_1^2 \, | \, H_0)$, is computed based on $\chi_\text{diff}^2$ using the chi-square distribution with 1 degree of freedom, where $\chi_\text{diff}^2=Z_\text{diff}^2$.
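
The statistic and the three p-values can be sketched in base R as follows. This is an illustration only, not the package's implementation: `mle_p0` and `mn_z` are hypothetical helper names, the constrained estimate is obtained numerically with `optimize()`, and the numerator is taken as the observed difference minus the null difference $S_0$, the form that matches the constraint $\tilde{p}_1-\tilde{p}_0=S_0$. The counts are those of the simulated dataset in the Example section.

```r
# Constrained MLE of the control proportion under p1 - p0 = s0,
# found numerically (an assumption of this sketch).
mle_p0 <- function(x1, n1, x0, n0, s0) {
  loglik <- function(p0) {
    p1 <- min(max(p0 + s0, 0), 1) # guard against rounding outside [0, 1]
    dbinom(x1, n1, p1, log = TRUE) + dbinom(x0, n0, p0, log = TRUE)
  }
  interval <- c(max(0, -s0), min(1, 1 - s0))
  optimize(loglik, interval = interval, maximum = TRUE, tol = 1e-10)$maximum
}

# Z-statistic: (observed difference - null difference) / sqrt(V_tilde).
mn_z <- function(x1, n1, x0, n0, s0 = 0) {
  p0 <- mle_p0(x1, n1, x0, n0, s0)
  p1 <- p0 + s0
  v <- (p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0) * (n1 + n0) / (n1 + n0 - 1)
  (x1 / n1 - x0 / n0 - s0) / sqrt(v)
}

z <- mn_z(x1 = 60, n1 = 100, x0 = 20, n0 = 100, s0 = 0)

p_upper <- pnorm(z, lower.tail = FALSE)          # Pr(Z >= Z_diff | H0)
p_lower <- pnorm(z)                              # Pr(Z <= Z_diff | H0)
p_two   <- pchisq(z^2, df = 1, lower.tail = FALSE) # two-sided, chi-square with 1 df

print(c(z = z, p_upper = p_upper, p_lower = p_lower, p_two = p_two))
```

Because $\chi_\text{diff}^2 = Z_\text{diff}^2$, the two-sided p-value equals twice the smaller of the two one-sided tail probabilities. With $S_0 = 0$ the constrained estimate reduces to the pooled proportion, here 80/200 = 0.4.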

Stratified analysis of M&N method

Assume the data include two treatment groups, test and control, and were collected under a stratified design. Within each stratum there are two independent binomial samples with a binary response variable. The parameter of interest is the difference between the population proportions of the test and control groups, and the analysis must adjust for the stratification variables.

Confidence interval

The confidence interval is based on the M&N method and given by the roots for $PD=P_1-P_0$ of the equation:

$$\chi_\alpha^2 = \frac{(\hat{p}_1^*-\hat{p}_0^*-PD)^2}{\sum_{i=1}^I(W_i/\sum_{k=1}^{K}W_k)^2\tilde{V}_i},$$

where $\hat{p}_s^* = \sum_{i=1}^I(W_i/\sum_{k=1}^KW_k)\hat{p}_{s i}$ for $s = 0, 1$;

$$\tilde{V}_i=\bigg[\frac{\tilde{p}_{1i}(1-\tilde{p}_{1i})}{n_{1i}}+\frac{\tilde{p}_{0i}(1-\tilde{p}_{0i})}{n_{0i}}\bigg]\frac{n_{1i}+n_{0i}}{n_{1i}+n_{0i}-1};$$

  • $W_i$ is the weight for the $i$-th stratum;
  • $I = K$ = number of strata; $i$ and $k$ index the strata;
  • $n_{1i}$ and $n_{0i}$ are the sample sizes in the $i$-th stratum for the test and control group, respectively;
  • $\hat{p}_{1i}$ and $\hat{p}_{0i}$ = observed proportions in the $i$-th stratum for the test and control group, respectively;
  • $\tilde{p}_{0i}$ and $\tilde{p}_{1i}$ are the MLEs of $P_{0i}$ and $P_{1i}$, respectively, computed under the constraint $\tilde{p}_{1i}=\tilde{p}_{0i}+PD$.

As in the unstratified analysis, the two-sided $100(1 - \alpha)$% CI is given by the roots for $PD = P_1 - P_0$, and the function uses the bisection algorithm to obtain the two roots (the confidence limits) for $PD$.

p-value and Z-statistic

The Z-statistic is computed as:

$$Z_\text{diff}=\frac{\hat{p}_1^*-\hat{p}^*_0-S_0}{\sqrt{\sum_{i=1}^I(W_i/\sum_{k=1}^{K}W_k)^2\tilde{V}_i}},$$

where $S_0$ is the pre-specified proportion difference under the null hypothesis;

  • $\tilde{p}_{0i}$ and $\tilde{p}_{1i}$ are the MLEs of $P_{0i}$ and $P_{1i}$, respectively, computed under the constraint $\tilde{p}_{1i} = \tilde{p}_{0i} + S_0$.

The p-value can be calculated as stated above.
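
The stratified statistic and interval can be sketched in base R along the same lines. This is a hedged illustration, not the code inside rate_compare(): the helper names (`mle_p0`, `mn_var`, `strat_parts`, `bisect`) are hypothetical, the per-stratum constrained estimates are found numerically with `optimize()`, and the sample size weight is assumed to be the total stratum size $n_{1i}+n_{0i}$. The per-stratum counts are tabulated by hand from the simulated dataset in the Example section.

```r
# Constrained MLE of the control proportion in one stratum (numerical sketch).
mle_p0 <- function(x1, n1, x0, n0, delta) {
  loglik <- function(p0) {
    p1 <- min(max(p0 + delta, 0), 1) # guard against rounding outside [0, 1]
    dbinom(x1, n1, p1, log = TRUE) + dbinom(x0, n0, p0, log = TRUE)
  }
  interval <- c(max(0, -delta), min(1, 1 - delta))
  optimize(loglik, interval = interval, maximum = TRUE, tol = 1e-10)$maximum
}

# Per-stratum variance V_tilde_i under the constraint p1i = p0i + delta.
mn_var <- function(x1, n1, x0, n0, delta) {
  p0 <- mle_p0(x1, n1, x0, n0, delta)
  p1 <- p0 + delta
  (p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0) * (n1 + n0) / (n1 + n0 - 1)
}

# Weighted difference and its variance; weights are normalized to sum to 1.
strat_parts <- function(x1, n1, x0, n0, delta, w) {
  w <- w / sum(w)
  v <- mapply(mn_var, x1, n1, x0, n0, MoreArgs = list(delta = delta))
  list(diff = sum(w * (x1 / n1 - x0 / n0)), var = sum(w^2 * v))
}

# Plain bisection; assumes f changes sign between a and b.
bisect <- function(f, a, b, eps = 1e-8, max_iter = 200) {
  fa <- f(a)
  for (i in seq_len(max_iter)) {
    m <- (a + b) / 2
    fm <- f(m)
    if (sign(fm) == sign(fa)) {
      a <- m
      fa <- fm
    } else {
      b <- m
    }
    if (b - a < eps) break
  }
  (a + b) / 2
}

# Responders and sample sizes per stratum (strata 1 to 4).
x1 <- c(15, 15, 15, 15); n1 <- c(25, 25, 25, 25) # test
x0 <- c(5, 5, 5, 5);     n0 <- c(26, 24, 26, 24) # control
w  <- n1 + n0 # assumed sample size weights

null_parts <- strat_parts(x1, n1, x0, n0, delta = 0, w = w)
est <- null_parts$diff
z   <- (est - 0) / sqrt(null_parts$var)

f <- function(d) {
  p <- strat_parts(x1, n1, x0, n0, delta = d, w = w)
  (p$diff - d)^2 / p$var - qchisq(0.95, df = 1)
}
ci <- c(lower = bisect(f, -1 + 1e-6, est), upper = bisect(f, est, 1 - 1e-6))

print(c(est = est, z = z, ci))
```

Under these assumptions the estimate, Z-statistic, and interval should be close to the output of the stratified call in the Example section. Other weighting schemes only change the vector `w`.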

Example

Load package

library(metalite.ae)

Data simulation

We simulate a dataset with two treatment groups and a binary outcome. For the stratified analysis, we consider four strata.

ana <- data.frame(
  treatment = c(rep(0, 100), rep(1, 100)),
  response  = c(rep(0, 80), rep(1, 20), rep(0, 40), rep(1, 60)),
  stratum   = c(rep(1:4, 12), 1, 3, 3, 1, rep(1:4, 12), rep(1:4, 25))
)

head(ana)
#>   treatment response stratum
#> 1         0        0       1
#> 2         0        0       2
#> 3         0        0       3
#> 4         0        0       4
#> 5         0        0       1
#> 6         0        0       2

Unstratified analysis

The function computes the risk difference, the Z-statistic, the p-value for the given type of test, and the two-sided $100(1 - \alpha)$% confidence interval for the difference between the two rates.

rate_compare(response ~ treatment, data = ana)
#>   est  z_score            p    lower     upper
#> 1 0.4 5.759051 4.229411e-09 0.269662 0.5165743

Stratified analysis

Sample size weighting is often used in clinical trials. The call below conducts a stratified M&N analysis with sample size weights.

The weight argument also accepts "equal" and "cmh". More details can be found in the rate_compare() documentation.

rate_compare(
  formula = response ~ treatment, strata = stratum, data = ana,
  weight = "ss"
)
#>         est  z_score            p     lower     upper
#> 1 0.3998397 5.712797 5.556727e-09 0.2684383 0.5172779

References

Miettinen, Olli, and Markku Nurminen. 1985. “Comparative Analysis of Two Rates.” Statistics in Medicine 4 (2): 213–26.