A recent simulation study (Langan et al., 2019) suggests that the two-step DerSimonian and Laird (DL2; DerSimonian & Kacker, 2007) estimator for tau-squared has the best properties for random-effects meta-analyses of continuous data.
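For reference, the general method-of-moments estimator implemented in the function below (labelled Eq. 6 in the code, after DerSimonian & Kacker, 2007) is

$$\hat{\tau}^2 = \max\left(0,\ \frac{\sum a_i (y_i - \bar{y}_w)^2 - \left(\sum a_i v_i - \sum a_i^2 v_i \big/ \sum a_i\right)}{\sum a_i - \sum a_i^2 \big/ \sum a_i}\right), \qquad \bar{y}_w = \frac{\sum a_i y_i}{\sum a_i},$$

where $y_i$ and $v_i$ are the study effect sizes and their variances. The one-step DL estimator uses the weights $a_i = 1/v_i$; the two-step DL2 estimator re-applies the same formula with $a_i = 1/(v_i + \hat{\tau}^2_{DL})$.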
You have the option of calculating traditional Wald-type confidence intervals and p-values or using the Hartung-Knapp/Sidik-Jonkman (HKSJ) method to calculate them. Langan et al. (2019) recommend the HKSJ method as the default (see also Cornell et al., 2014), unless your meta-analysis contains fewer than five studies with unequal sample sizes or there is no heterogeneity in the effect sizes.
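Concretely, instead of the Wald-type standard error $1/\sqrt{\sum w_i}$, the HKSJ method (as implemented in the function below) uses

$$SE_{HKSJ} = \sqrt{\frac{\sum w_i (y_i - \hat{\mu})^2}{(k - 1) \sum w_i}}, \qquad w_i = \frac{1}{v_i + \hat{\tau}^2},$$

with the confidence interval and p-value based on a t-distribution with $k - 1$ degrees of freedom, where $k$ is the number of studies.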
Finally, you can calculate the I-squared statistic and its confidence interval using either the standard method (Higgins et al., 2003) or the Q-profile method.
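The two methods quantify inconsistency slightly differently. As implemented below, the standard method computes

$$I^2 = \max\left(0,\ 100 \times \frac{Q - (k - 1)}{Q}\right)$$

from Cochran's Q-statistic, whereas the Q-profile method computes

$$I^2 = 100 \times \frac{\hat{\tau}^2}{\hat{\tau}^2 + V}, \qquad V = \frac{(k - 1) \sum a_i}{\left(\sum a_i\right)^2 - \sum a_i^2},$$

where $V$ is a 'typical' within-study variance, and derives the confidence interval by profiling the generalised Q-statistic over candidate values of tau-squared.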
Dependencies
The function requires the dplyr and rlang packages.
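Assuming both are installed, attaching dplyr is sufficient to run everything below; rlang is only used internally for the {{ }} (curly-curly) syntax, and dplyr re-exports tibble() for the example data.

# Install once if needed: install.packages(c("dplyr", "rlang"))
library(dplyr)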
Arguments
The function takes the following arguments:
yi: Effect size
vi: Variance of the effect size
data: data.frame or tibble containing the variables
lab: Study label
method: Tau-squared estimator, either "DL" or "DL2" (default = "DL")
ci: Confidence level (default = .95)
Q.profile: Use the Q-profile method for I-squared (default = FALSE)
pi: Calculate the prediction interval (default = FALSE)
hksj: Use the HKSJ adjustment (default = FALSE)
Function
mw_est <- function(yi,
                   vi,
                   data,
                   lab,
                   method = c("DL", "DL2"),
                   ci = .95,
                   Q.profile = FALSE,
                   pi = FALSE,
                   hksj = FALSE) {
  # Resolve the tau-squared estimator ("DL" by default)
  method <- match.arg(method)
  ## General Method-of-Moments Estimate for Tau-Squared (Eq. 6)
  .tau2MM <- function(yi, vi, ai) {
    yw <- sum(ai * yi) / sum(ai)
    tau2 <- (sum(ai * (yi - yw)^2) - (sum(ai * vi) - sum(ai^2 * vi) / sum(ai))) /
      (sum(ai) - sum(ai^2) / sum(ai))
    # Return only non-negative values
    max(0, tau2)
  }
  # Keep complete cases
  data <- data %>%
    select(c({{ lab }}, {{ yi }}, {{ vi }})) %>%
    filter_all(all_vars(!is.na(.)))
  # Extract variables
  yi <- pull(data, {{ yi }})
  vi <- pull(data, {{ vi }})
  # Calculate DL tau2
  ai <- 1 / vi
  tau2 <- .tau2MM(yi = yi, vi = vi, ai = ai)
  # Calculate DL2 if requested
  if (method == "DL2") {
    ai <- 1 / (vi + tau2)
    tau2 <- .tau2MM(yi = yi, vi = vi, ai = ai)
  }
  ## Calculate inverse-variance weights
  wi <- 1 / (vi + tau2) # Eq. 2
  wy <- wi * yi
  mw <- sum(wy) / sum(wi) # Eq. 1
  # Get number of studies
  k <- length(yi)
  # Degrees of freedom
  df <- k - 1
  ### Heterogeneity
  ## Cochran's Q-statistic
  # Use DL inverse-variance weight (ai = 1 / vi)
  ai <- 1 / vi
  # Calculate Q-statistic
  Q <- sum(ai * (yi - sum(ai * yi) / sum(ai))^2)
  # Calculate p-value
  Q.p <- pchisq(Q, df = df, lower.tail = FALSE)
  # Critical z-value for the requested confidence level
  z.cdf <- qnorm(1 - (1 - ci) / 2)
  # Method used to calculate heterogeneity and confidence intervals
  if (Q.profile) {
    ## Q-profile method to calculate heterogeneity and confidence
    ## intervals (Eq. 14)
    ## Calculate I-squared and H
    # a. Use DL inverse-variance weight (ai = 1 / vi)
    ai <- 1 / vi
    # b. Calculate V, the 'typical' within-study variance
    V <- (k - 1) * sum(ai) / (sum(ai)^2 - sum(ai^2))
    # c. Calculate I-squared
    I.sq <- 100 * tau2 / (tau2 + V)
    # d. Calculate H (the square root of H-squared)
    H <- sqrt((tau2 + V) / V)
    # Calculate confidence intervals if k > 2
    if (k > 2) {
      ## Generalised Q-statistic bounds
      Q.lb <- qchisq((1 - ci) / 2, df, lower.tail = TRUE)
      Q.ub <- qchisq((1 - ci) / 2, df, lower.tail = FALSE)
      ## Lower confidence limit
      # Initialise values
      tau2.tmp <- 0
      F.tau2 <- 1
      while (F.tau2 > 0) {
        # i. Calculate weights for tau2
        W <- 1 / (vi + tau2.tmp)
        yW <- sum(yi * W) / sum(W)
        # ii. Calculate delta tau2
        F.tau2 <- sum(W * (yi - yW)^2) - Q.ub
        F.tau2 <- max(F.tau2, 0)
        delta.tau2 <- F.tau2 / sum((W^2) * (yi - yW)^2)
        # iii. Next iterative step
        tau2.tmp <- tau2.tmp + delta.tau2
      }
      tau2.tmp <- max(0, tau2.tmp)
      I.lower <- 100 * tau2.tmp / (tau2.tmp + V)
      H.lower <- sqrt((tau2.tmp + V) / V)
      ## Upper confidence limit
      # Initialise values
      tau2.tmp <- 0
      F.tau2 <- 1
      while (F.tau2 > 0) {
        # i. Calculate weights for tau2
        W <- 1 / (vi + tau2.tmp)
        yW <- sum(yi * W) / sum(W)
        # ii. Calculate delta tau2
        F.tau2 <- sum(W * (yi - yW)^2) - Q.lb
        F.tau2 <- max(F.tau2, 0)
        delta.tau2 <- F.tau2 / sum((W^2) * (yi - yW)^2)
        # iii. Next iterative step
        tau2.tmp <- tau2.tmp + delta.tau2
      }
      tau2.tmp <- max(0, tau2.tmp)
      I.upper <- 100 * tau2.tmp / (tau2.tmp + V)
      H.upper <- sqrt((tau2.tmp + V) / V)
    } else {
      I.lower <- NA
      I.upper <- NA
      H.lower <- NA
      H.upper <- NA
    }
  } else {
    ## I-squared (standard method; Higgins et al., 2003)
    I.sq <- max(0, 100 * ((Q - df) / Q))
    ## H
    H <- sqrt(Q / df)
    # Calculate the SE of ln(H), aka B
    if (k > 2) {
      if (Q > k) {
        B <- 0.5 * ((log(Q) - log(df)) / (sqrt(2 * Q) - sqrt(2 * df - 1)))
      } else {
        B <- sqrt((1 / (2 * (df - 1))) * (1 - (1 / (3 * (df - 1)^2))))
      }
      # H confidence intervals
      H.lower <- max(1, exp(log(H) - z.cdf * B))
      H.upper <- exp(log(H) + z.cdf * B)
      # I-squared confidence intervals
      I.L <- exp(0.5 * log(Q / df) - (z.cdf * B))
      I.U <- exp(0.5 * log(Q / df) + (z.cdf * B))
      I.lower <- max(0, 100 * ((I.L^2 - 1) / I.L^2))
      I.upper <- max(0, 100 * ((I.U^2 - 1) / I.U^2))
    } else {
      I.lower <- NA
      I.upper <- NA
      H.lower <- NA
      H.upper <- NA
    }
  }
  # Save all heterogeneity statistics in a list
  het.test <- list(
    Q = Q, Q.p = Q.p,
    tau2 = tau2,
    I.sq = I.sq, I.lb = I.lower, I.ub = I.upper,
    H = H, H.lb = H.lower, H.ub = H.upper
  )
  # HKSJ adjustment
  if (hksj && k >= 3) {
    # Calculate HKSJ-adjusted standard error
    sew <- sqrt(sum(wi * (yi - mw)^2) / (df * sum(wi)))
    # Calculate t-statistic
    t <- mw / sew
    # Calculate p-value based on the t-statistic and k - 1 df
    t.p <- 2 * pt(-abs(t), df = df)
    # Output test for summary effect size
    mw.test <- list(t = t, p = t.p)
    # Confidence intervals
    mw.lower <- mw - sew * qt(1 - ((1 - ci) / 2), df = df)
    mw.upper <- mw + sew * qt(1 - ((1 - ci) / 2), df = df)
  } else {
    # Wald-type statistics
    # Calculate standard error
    sew <- 1 / sqrt(sum(wi)) # Eq. 3
    # Calculate z-statistic and p-value
    z <- mw / sew
    z.p <- 2 * pnorm(-abs(z))
    # Confidence intervals
    mw.lower <- mw - z.cdf * sew
    mw.upper <- mw + z.cdf * sew
    # Output test for summary effect size
    mw.test <- list(z = z, p = z.p)
  }
  # Prediction interval (t-based with HKSJ, z-based otherwise)
  if (pi && k > 2) {
    if (hksj) {
      pi.lower <- mw - sqrt(tau2 + sew^2) * qt(1 - ((1 - ci) / 2), df = df)
      pi.upper <- mw + sqrt(tau2 + sew^2) * qt(1 - ((1 - ci) / 2), df = df)
    } else {
      pi.lower <- mw - sqrt(tau2 + sew^2) * z.cdf
      pi.upper <- mw + sqrt(tau2 + sew^2) * z.cdf
    }
    # Output prediction interval
    pi.res <- list(pi.lb = pi.lower, pi.ub = pi.upper)
  } else {
    pi.res <- NULL
  }
  # Print the data used in the analysis
  cat("\n", "Data used for meta-analysis", "\n", "\n", sep = "")
  print(data)
  cat("\n", sep = "")
  # Output results as a list
  return(
    c(
      method = method,
      est = mw, se.est = sew,
      mw.test, ci.lb = mw.lower, ci.ub = mw.upper,
      het.test, pi.res
    )
  )
}
Example
We are going to recreate the analyses in DerSimonian and Kacker (2007).
First we recreate the CLASP dataset, entering the log odds ratio (yi) and its variance (vi) for each study.
dat_clasp <- tibble(
  study = c("Baum", "Anand", "CLASP", "Bradley", "Lovet", "Teo"),
  n1i = c(156, 303, 565, 1570, 103, 4659),
  n2i = c(74, 303, 477, 1565, 105, 4650),
  ai = c(5, 5, 12, 69, 9, 313),
  ci = c(8, 17, 9, 94, 11, 352),
  yi = c(-1.2976, -1.2649, 0.1208, -0.3294, -0.2007, -0.1285),
  vi = c(0.3468, 0.2657, 0.1984, 0.0265, 0.2233, 0.0065)
)
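The yi and vi values were entered by hand, but as a quick check they can be recovered from the 2 x 2 counts, treating ai and ci as the event counts and n1i and n2i as the arm sizes:

# Reproduce yi (log odds ratio) and vi (its large-sample variance)
dat_clasp %>%
  mutate(
    yi = log((ai / (n1i - ai)) / (ci / (n2i - ci))),
    vi = 1 / ai + 1 / (n1i - ai) + 1 / ci + 1 / (n2i - ci)
  )

For example, for Baum: log((5/151) / (8/66)) = -1.2976 and 1/5 + 1/151 + 1/8 + 1/66 = 0.3468, matching the values above.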
Your data should look like the following:
study n1i n2i ai ci yi vi
1 Baum 156 74 5 8 -1.2976 0.3468
2 Anand 303 303 5 17 -1.2649 0.2657
3 CLASP 565 477 12 9 0.1208 0.1984
4 Bradley 1570 1565 69 94 -0.3294 0.0265
5 Lovet 103 105 9 11 -0.2007 0.2233
6 Teo 4659 4650 313 352 -0.1285 0.0065
Next we run the function. I have requested the DL2 estimator and the Q-profile method to recreate the results of DerSimonian and Kacker (2007), who reported a tau-squared of 0.11 and a summary effect size (log odds ratio) of -0.37 (SE = 0.19).
mw_est(
  yi = yi,
  vi = vi,
  data = dat_clasp,
  lab = study,
  method = "DL2",
  pi = FALSE,
  Q.profile = TRUE,
  hksj = FALSE
)
You will get the following output, which is consistent with the results of DerSimonian and Kacker (2007).
Data used for meta-analysis
study yi vi
1 Baum -1.2976 0.3468
2 Anand -1.2649 0.2657
3 CLASP 0.1208 0.1984
4 Bradley -0.3294 0.0265
5 Lovet -0.2007 0.2233
6 Teo -0.1285 0.0065
$method
[1] "DL2"
$est
[1] -0.3654628
$se.est
[1] 0.1900638
$z
[1] -1.922843
$p
[1] 0.05449979
$ci.lb
[1] -0.737981
$ci.ub
[1] 0.007055387
$Q
[1] 9.677867
$Q.p
[1] 0.08489455
$tau2
[1] 0.1058598
$I.sq
[1] 64.72927
$I.lb
[1] 0
$I.ub
[1] 97.18495
$H
[1] 1.683809
$H.lb
[1] 1
$H.ub
[1] 5.960143
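If you have metafor installed, a quick sanity check (a suggestion only, not part of the original workflow) is to compare against its one-step DL estimator; tau-squared will differ slightly because rma() does not apply the second DL2 step:

# Optional cross-check: one-step DL via metafor
library(metafor)
rma(yi = yi, vi = vi, data = dat_clasp, method = "DL")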
A future post will present a Shiny application that produces a nicely formatted table and forest plot using the above function.
References
Cornell, J. E., Mulrow, C. D., Localio, R., Stack, C. B., Meibohm, A. R., Guallar, E., & Goodman, S. N. (2014). Random-effects meta-analysis of inconsistent effects: A time for change. Annals of Internal Medicine, 160(4), 267–270. https://doi.org/10.7326/M13-2886
DerSimonian, R., & Kacker, R. (2007). Random-effects model for meta-analysis of clinical trials: An update. Contemporary Clinical Trials, 28(2), 105–114. https://doi.org/10.1016/j.cct.2006.04.004
Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring inconsistency in meta-analyses. BMJ, 327(7414), 557–560. https://doi.org/10.1136/bmj.327.7414.557
Langan, D., Higgins, J. P. T., Jackson, D., Bowden, J., Veroniki, A. A., Kontopantelis, E., … Simmonds, M. (2019). A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Research Synthesis Methods, 10(1), 83–98. https://doi.org/10.1002/jrsm.1316