# Sample size estimation for balanced randomised controlled trials

This post presents an R function to implement the sample size estimation presented in Twisk (2013) for continuous outcomes.

Formula 11.1 applies when the researcher wants to compare two groups at a single point in time (e.g., placebo versus treatment at post-intervention). Formula 11.3 applies when there is more than one follow-up measurement and the researcher wants to compare the two groups on the average of the outcome variable over the total follow-up period.

## Formulae

The formula for a between-groups difference at a single time point is:

$N = \frac{(Z_{(1-\alpha/2)} + Z_{(1-\beta)})^2 \times \sigma^2 \times 2}{v^2}\qquad(11.1)$

where $N$ is the sample size per group, $Z_{(1-\alpha/2)}$ is the $(1-\alpha/2)$ percentile point of the standard normal distribution, $Z_{(1-\beta)}$ is the $(1-\beta)$ percentile point of the standard normal distribution (with $1-\beta$ the desired power), $\sigma$ is the standard deviation of the outcome variable, and $v$ is the between-groups difference in the mean value of the outcome variable.
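As a rough illustration, Formula 11.1 can be evaluated directly in R with `qnorm()`. The helper name `n_single_point` and the example values of `sigma` and `v` are assumptions made for this sketch, and rounding up to the next whole participant is a common convention rather than part of the formula itself.

```r
# Sketch of Formula 11.1 (hypothetical helper, not the function from this post).
n_single_point <- function(sigma, v, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha / 2)  # Z_(1 - alpha/2)
  z_beta  <- qnorm(power)          # Z_(1 - beta), because power = 1 - beta
  n <- (z_alpha + z_beta)^2 * sigma^2 * 2 / v^2
  ceiling(n)                       # assumption: round up to a whole participant
}

# Illustrative values only: SD of 10 and a mean difference of 5 (so v / sigma = 0.5).
n_single_point(sigma = 10, v = 5)
```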

The formula for examining the effect of the intervention on average over the total follow-up period is:

$N = \frac{(Z_{(1-\alpha/2)} + Z_{(1-\beta)})^2 \times \sigma^2 \times 2[1+(T-1)\rho ]}{v^2T}\qquad(11.3)$

where $N$ is the sample size per group, $Z_{(1-\alpha/2)}$ is the $(1-\alpha/2)$ percentile point of the standard normal distribution, $Z_{(1-\beta)}$ is the $(1-\beta)$ percentile point of the standard normal distribution, $\sigma$ is the standard deviation of the outcome variable, $T$ is the number of follow-up measurements, $\rho$ is the correlation coefficient of the repeated measures, and $v$ is the between-groups difference in the mean value of the outcome variable.
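Formula 11.3 can be sketched the same way. The helper name `n_average` and the example inputs are assumptions for illustration, and the result is rounded up by convention.

```r
# Sketch of Formula 11.3 (hypothetical helper, not the function from this post).
n_average <- function(sigma, v, points, rho, alpha = 0.05, power = 0.80) {
  z_sum <- qnorm(1 - alpha / 2) + qnorm(power)
  # points is T, the number of follow-up measurements; rho is their correlation.
  n <- z_sum^2 * sigma^2 * 2 * (1 + (points - 1) * rho) / (v^2 * points)
  ceiling(n)  # assumption: round up to a whole participant
}

# Illustrative values only: two follow-ups correlated at 0.7.
n_average(sigma = 10, v = 5, points = 2, rho = 0.7)
```

With `points = 1` the term $[1+(T-1)\rho]/T$ equals 1, so the result matches Formula 11.1. The same holds for any $T$ when $\rho = 1$, because perfectly correlated repeated measures add no new information.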

Both formulas contain the expression $\frac{\sigma^2}{v^2}$. Because Cohen’s d is the between-groups mean difference divided by the standard deviation, $d = \frac{v}{\sigma}$, this expression is equivalent to $\frac{1}{d^2}$.

Therefore, the formulas simplify to:

$N = \frac{(Z_{(1-\alpha/2)} + Z_{(1-\beta)})^2 \times 2}{d^2}\qquad(11.1a)$

$N = \frac{(Z_{(1-\alpha/2)} + Z_{(1-\beta)})^2 \times 2[1+(T-1)\rho ]}{d^2T}\qquad(11.3a)$

As a result, the R function asks for the between-groups Cohen’s d effect size, which is conventionally classified as small (d = .2), medium (d = .5), or large (d = .8).

The function also adjusts the estimated sample size for a user-specified attrition rate.
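The source of `expsample()` is not reproduced in this post, so the following is only a minimal sketch of how such a function might be put together. The name `expsample_sketch`, the returned list, and in particular the attrition rule (dividing the rounded per-group size by `1 - attr` and rounding up again) are assumptions. They were chosen because they reproduce the figures in the examples below, not because they are confirmed to be the original implementation.

```r
# Hypothetical sketch; NOT the actual expsample() implementation.
expsample_sketch <- function(d, alpha = 0.05, power = 0.80, attr = 0,
                             avg = FALSE, points = 1, rho = 0) {
  stopifnot(d != 0, attr >= 0, attr < 1, points >= 1)
  z_sum <- qnorm(1 - alpha / 2) + qnorm(power)
  # Single time point, with sigma^2 / v^2 written as 1 / d^2.
  n_raw <- z_sum^2 * 2 / d^2
  if (avg) {
    # Repeated-measures factor [1 + (T - 1) * rho] / T from Formula 11.3.
    n_raw <- n_raw * (1 + (points - 1) * rho) / points
  }
  n_group <- ceiling(n_raw)
  # Assumed attrition rule: inflate by 1 / (1 - attr), then round up.
  n_adjusted <- ceiling(n_group / (1 - attr))
  list(per_group = n_group, total = 2 * n_group,
       per_group_adjusted = n_adjusted, total_adjusted = 2 * n_adjusted)
}

expsample_sketch(d = 0.5, attr = 0.20)
```

Returning a list instead of printing a report keeps the sketch short and makes the numbers easy to reuse in later calculations.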

## Example 1. Single Comparison Point, No Attrition

> expsample(d = 0.5, alpha = 0.05, power = 0.80)

Parameters Specified
--------------------
Alpha: 0.05
Power: 0.8
Effect size (Cohen's d): 0.5
Number of outcome measures: 1

Required Sample Size
--------------------
Number of participants per group: 63
Total participants required: 126
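
As a hand check of this result, the single-time-point formula can be evaluated with $\sigma^2/v^2$ replaced by $1/d^2$, using $Z_{0.975} \approx 1.960$ and $Z_{0.80} \approx 0.842$ (values rounded to three decimals):

$N = \frac{(1.960 + 0.842)^2 \times 2}{0.5^2} \approx \frac{7.85 \times 2}{0.25} \approx 62.8$

Rounding up to a whole participant gives 63 per group, or 126 in total, which agrees with the output above. Rounding up is assumed here because it is consistent with the reported figure.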


## Example 2. Single Comparison Point, 20% (0.20) Attrition Rate

> expsample(d = 0.5, alpha = 0.05, power = 0.80, attr = 0.20)

Parameters Specified
--------------------
Alpha: 0.05
Power: 0.8
Effect size (Cohen's d): 0.5
Number of outcome measures: 1

Required Sample Size
--------------------
Number of participants per group: 63
Total participants required: 126

Sample size adjusted for attrition rate
---------------------------------------
Expected attrition rate: 0.2
Number of participants per group: 79
Total participants required: 158


## Example 3. Average Over the Total Follow-Up Period (2 Measurements), No Attrition

> expsample(d = 0.5, alpha = 0.05, power = 0.80, avg = TRUE, points = 2, rho = 0.7)

Parameters Specified
--------------------
Alpha: 0.05
Power: 0.8
Effect size (Cohen's d): 0.5
Number of outcome measures: 1
Average over the total follow-up period: TRUE
Number of follow-up measurements: 2
Correlation between repeated measures: 0.7

Required Sample Size
--------------------
Number of participants per group: 54
Total participants required: 108
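
This figure can also be checked by hand. With $\sigma^2/v^2$ written as $1/d^2$, $T = 2$, $\rho = 0.7$, and the same rounded percentile points as before ($1.960$ and $0.842$):

$N = \frac{(1.960 + 0.842)^2 \times 2[1+(2-1)(0.7)]}{0.5^2 \times 2} \approx \frac{7.85 \times 2 \times 1.7}{0.5} \approx 53.4$

Rounding up gives 54 per group. Compared with the single-time-point design, the requirement shrinks by the factor $[1+(T-1)\rho]/T = 0.85$.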


## Reference

Twisk, J. W. R. (2013). Applied Longitudinal Data Analysis for Epidemiology (2nd ed.). Cambridge, UK: Cambridge University Press.
