Below is the R code for conducting mediation analyses using the bootstrap method detailed in Preacher and Hayes (2004). Please note that my function uses the bias corrected and accelerated (BCa) confidence intervals, whereas their SPSS and SAS macros use the bias corrected (BC) confidence intervals. However, they should yield similar results.
Note
This function is redundant and remains posted for archival purposes only.
Readers should refer to Andrew Hayes’ R version of the PROCESS macro. Alternatively, I recommend Keon-Woong Moon’s processR package, which although requires specification of models using lavaan, you can use its Shiny app to run your analyses and get automatically generated statistical diagrams and formatted tables.
The med()
function takes the following arguments:
iv
= independent variable (X). Use:iv = "iv
“dv
= dependent variable (Y). Use:dv = "dv"
mv
= mediating variable (M). Use:mv = "mv"
cv
= covariate(s) if any. Use:cv = "cv.1"
for single covariate orcv = c("cv.1", "cv.2", ..., "cv.n")
for multiple covariates.dec
= number of decimal places (default = 4)rep
= number of bootstrap replications (default = 5000)
Function
# Load required libraries ---- if (!require(pacman)) install.packages("pacman") pacman::p_load(boot, car, MASS, QuantPsych) #---- Bootstrap Mediation Function ---- boot.med <- function(data, formula.m, formula.y, iv, mv, i) { data.resamp <- data[i, ] reg.m <- lm(formula.m, data = data.resamp) reg.y <- lm(formula.y, data = data.resamp) # Make sure that you pull the correct coefficients from each regression equation # With interactions, quadratics, etc, R may reorder the equation so you need to check it. # The a path coefficient a <- coefficients(reg.m)[iv] # The b path coefficient b <- coefficients(reg.y)[mv] # The indirect effect (product of a*b) ind <- a * b # The direct effect (c') dir <- coefficients(reg.y)[iv] # Total effect total <- ind + dir # Pass to bootstrap function boot.med <- return(c(ind, dir, total)) } #---- Mediation Analyses ---- # # data = data frame # iv = independent variable (X) # dv = dependent variable (Y) # mv = mediating variable (M) # cv = covariate(s) if any. Use: cv = c("var.1", "var.2", "var.3", ..., "var.n") # reps = number of bootstrap replications (default = 1000) # med <- function(data, iv, dv, mv, cv, dec = 4, reps = 5000) { # No covariates specified if (missing(cv)) { iv.formula <- as.formula(paste(mv, paste("~"), paste(iv))) mv.formula <- as.formula(paste(dv, paste("~"), paste(mv))) dv.formula <- as.formula(paste(dv, paste("~"), paste(iv))) dvmv.formula <- as.formula( paste(dv, paste("~"), paste(iv), paste("+"), paste(mv)) ) } else { iv.formula <- as.formula( paste(mv, paste("~"), paste(iv), paste("+"), paste(cv, collapse = "+")) ) mv.formula <- as.formula( paste(dv, paste("~"), paste(mv), paste("+"), paste(cv, collapse = "+")) ) dv.formula <- as.formula( paste(dv, paste("~"), paste(iv), paste("+"), paste(cv, collapse = "+")) ) dvmv.formula <- as.formula( paste(dv, paste("~"), paste(iv), paste("+"), paste(mv), paste("+"), paste(cv, collapse = "+")) ) } ## Regressions ---- # IV on MV iv.out <- lm(iv.formula, data = data) # Coefficients iv.coef <- coef(iv.out) # Standard Errors iv.se <- summary(iv.out)$coefficients[, 2] # p-values iv.pr <- 2 * pt( abs(iv.coef / iv.se), df = nrow(iv.out$model) - 2, lower.tail = FALSE ) # Confidence Intervals iv.ll <- iv.coef - 1.96 * iv.se iv.ul <- iv.coef + 1.96 * iv.se # Standardised Betas iv.beta <- lm.beta(iv.out) iv.beta <- cbind(NA, iv.beta) iv.beta <- rbind(NA, iv.beta)[, 2] # R-squared iv.rsquared <- summary(iv.out)$r.squared # p-value for R-squared iv.rsquared.pr <- pf( summary(iv.out)$fstatistic[1], summary(iv.out)$fstatistic[2], summary(iv.out)$fstatistic[3], lower.tail = F ) # Output Table iv.table <- cbind( "Estimate" = round(iv.coef, dec), "SE" = round(iv.se, dec), "Pr(>|t|)" = round( 2 * pt( abs(iv.coef / iv.se), df = nrow(iv.out$model) - 2, lower.tail = FALSE ), dec ), "LL" = round(iv.coef - 1.96 * iv.se, dec), "UL" = round(iv.coef + 1.96 * iv.se, dec), "Beta" = round(iv.beta, dec) ) ### MV on DV ---- mv.out <- lm(mv.formula, data = data) # Coefficients mv.coef <- coef(mv.out) # Standard Errors mv.se <- summary(mv.out)$coefficients[, 2] # p-values mv.pr <- 2 * pt( abs(mv.coef / mv.se), df = nrow(mv.out$model) - 2, lower.tail = FALSE ) # Confidence intervals mv.ll <- mv.coef - 1.96 * mv.se mv.ul <- mv.coef + 1.96 * mv.se # Standardised Betas mv.beta <- lm.beta(mv.out) mv.beta <- cbind(NA, mv.beta) mv.beta <- rbind(NA, mv.beta)[, 2] # R-squared mv.rsquared <- summary(mv.out)$r.squared # p-value for R-squared mv.rsquared.pr <- pf( summary(mv.out)$fstatistic[1], summary(mv.out)$fstatistic[2], summary(mv.out)$fstatistic[3], lower.tail = F ) # Output Table mv.table <- cbind( "Estimate" = round(mv.coef, dec), "SE" = round(mv.se, dec), "Pr(>|t|)" = round(2 * pt(abs(mv.coef / mv.se), df = nrow(mv.out$model) - 2, lower.tail = FALSE ), dec), "LL" = round(mv.coef - 1.96 * mv.se, dec), "UL" = round(mv.coef + 1.96 * mv.se, dec), "Beta" = round(mv.beta, dec) ) ### IV on DV ---- dv.out <- lm(dv.formula, data = data) # Coefficients dv.coef <- coef(dv.out) # Standard Errors dv.se <- summary(dv.out)$coefficients[, 2] # SEs dv.pr <- 2 * pt(abs(dv.coef / dv.se), df = nrow(dv.out$model) - 2, lower.tail = FALSE) # p-values # Confidence Intervals dv.ll <- dv.coef - 1.96 * dv.se dv.ul <- dv.coef + 1.96 * dv.se # Standardised Betas dv.beta <- lm.beta(dv.out) dv.beta <- cbind(NA, dv.beta) dv.beta <- rbind(NA, dv.beta)[, 2] # R-squared dv.rsquared <- summary(dv.out)$r.squared # R2 # p-value for R-squared dv.rsquared.pr <- pf( summary(dv.out)$fstatistic[1], summary(dv.out)$fstatistic[2], summary(dv.out)$fstatistic[3], lower.tail = F ) # Output Table dv.table <- cbind( "Estimate" = round(dv.coef, dec), "SE" = round(dv.se, dec), "Pr(>|t|)" = round(2 * pt(abs(dv.coef / dv.se), df = nrow(dv.out$model) - 2, lower.tail = FALSE ), dec), "LL" = round(dv.coef - 1.96 * dv.se, dec), "UL" = round(dv.coef + 1.96 * dv.se, dec), "Beta" = round(dv.beta, dec) ) ### IV and MV on DV ---- # Coefficients dvmv.out <- lm(dvmv.formula, data = data) dvmv.coef <- coef(dvmv.out) # Standard Errors dvmv.se <- summary(dvmv.out)$coefficients[, 2] dvmv.pr <- 2 * pt( abs(dvmv.coef / dvmv.se), df = nrow(dvmv.out$model) - 2, lower.tail = FALSE) # Confidence Intervales dvmv.ll <- dvmv.coef - 1.96 * dvmv.se dvmv.ul <- dvmv.coef + 1.96 * dvmv.se # Standardised Betas dvmv.beta <- lm.beta(dvmv.out) dvmv.beta <- cbind(NA, dvmv.beta) dvmv.beta <- rbind(NA, dvmv.beta)[, 2] # R-squared dvmv.rsquared <- summary(dvmv.out)$r.squared # p-value for R-squared dvmv.rsquared.pr <- pf( summary(dvmv.out)$fstatistic[1], summary(dvmv.out)$fstatistic[2], summary(dvmv.out)$fstatistic[3], lower.tail = F ) # Output Table dvmv.table <- cbind( "Estimate" = round(dvmv.coef, dec), "SE" = round(dvmv.se, dec), "Pr(>|t|)" = round( 2 * pt( abs(dvmv.coef / dvmv.se), df = nrow(data) - 2, lower.tail = FALSE ), dec), "LL" = round(dvmv.coef - 1.96 * dvmv.se, dec), "UL" = round(dvmv.coef + 1.96 * dvmv.se, dec), "Beta" = round(dvmv.beta, dec) ) ## Bootstrap function ---- medboot.out <- boot( statistic = boot.med, data = data, formula.m = iv.formula, formula.y = dvmv.formula, iv = iv, mv = mv, R = reps, parallel = "multicore" ) # Indirect effect indirect.b <- round(summary(medboot.out)$original[1], dec) # Indirect B indirect.se <- round(summary(medboot.out)$bootSE[1], dec) # Indirect SE indirect.bca <- boot.ci(medboot.out, type = "bca", index = 1) # Indirect BCA indirect.bca.l <- round(indirect.bca$bca[4], dec) # Extract lower BCa CI indirect.bca.u <- round(indirect.bca$bca[5], dec) # Extract upper BCa CI # Direct effect direct.b <- round(summary(medboot.out)$original[2], dec) # Direct B direct.se <- round(summary(medboot.out)$bootSE[2], dec) # Direct SE direct.bca <- boot.ci(medboot.out, type = "bca", index = 2) # indirect direct.bca.l <- round(direct.bca$bca[4], dec) # Extract lower BCa CI direct.bca.u <- round(direct.bca$bca[5], dec) # Extract upper BCa CI # Total effects total.b <- round(summary(medboot.out)$original[3], dec) # Direct B total.se <- round(summary(medboot.out)$bootSE[3], dec) # Direct SE total.bca <- boot.ci(medboot.out, type = "bca", index = 3) # Total Effect total.bca.l <- round(total.bca$bca[4], dec) # Extract lower BCa CI total.bca.u <- round(total.bca$bca[5], dec) # Extract upper BCa CI cat( paste( "\n", "Independent Variable: ", iv, "\n", "Dependent Variable: ", dv, "\n", "Mediator Variable: ", mv, "\n", "\n", "IV (and CVs) on MV: Path a", "\n", sep = "" ) ) print(iv.table) cat( paste( "\n", "R-Squared: ", round(iv.rsquared, dec), ", p-value: ", round(iv.rsquared.pr, dec), "\n", sep = "" ) ) cat(paste("\n", "\n", "MV (and CVs) on DV: Path b", "\n", sep = "" )) print(mv.table) cat( paste( "\n", "R-Squared: ", round(mv.rsquared, dec), ", p-value: ", round(mv.rsquared.pr, dec), "\n", sep = "" ) ) cat( paste( "\n", "\n", "IV (and CVs) on DV: Path c", "\n", sep = "" ) ) print(dv.table) cat( paste( "\n", "R-Squared: ", round(dv.rsquared, dec), ", p-value: ", round(dv.rsquared.pr, dec), sep = "" ) ) cat( paste( "\n", "\n", "IV and MV (and CVs) on DV", "\n", sep = "" ) ) print(dvmv.table) cat( paste( "\n", "R-Squared: ", round(dvmv.rsquared, dec), ", p-value: ", round(dvmv.rsquared.pr, dec), "\n", sep = "" ) ) cat( "\n", "============================================", "\n", "Bootstrap Mediation Effects with 95% BCa CIs", "\n", "============================================", "\n", "\n", "Indirect effect", "\n", "---------------", "\n", paste("B = ", indirect.b, ", SE = ", indirect.se, ", 95% CI [", indirect.bca.l, ", ", indirect.bca.u, "]", sep = ""), "\n", "\n", "Direct effect", "\n", "-------------", "\n", paste("B = ", direct.b, ", SE = ", direct.se, ", 95% CI [", direct.bca.l, ", ", direct.bca.u, "]", sep = ""), "\n", "\n", "Total effect", "\n", "-------------", "\n", paste("B = ", total.b, ", SE = ", total.se, ", 95% CI [", total.bca.l, ", ", total.bca.u, "]", sep = ""), "\n", sep = "" ) }
Example
Assessing whether body image inflexibility (biaaq.total
) mediates the relationship between negative body appearance evaluation (mbsrq.ae
) and avoidance behaviours (bicsi.avoid
) while controlling for age (age
), gender (gender
), and body image investment (asir.comp
).
med( data = body_image, dv = "bicsi.avoid", iv = "mbsrq.ae", mv = "biaaq.total", cv = c("gender", "asir.comp"), dec = 3, reps = 5000 ) Independent Variable: mbsrq.ae Dependent Variable: bicsi.avoid Mediator Variable: biaaq.total IV (and CVs) on MV Estimate SE Pr(>|t|) LL UL Beta (Intercept) 79.286 6.112 0 67.306 91.265 NA mbsrq.ae 6.926 0.890 0 5.181 8.671 0.394 genderMale -6.301 1.774 0 -9.779 -2.824 -0.171 asir.comp -12.881 1.277 0 -15.384 -10.378 -0.518 R-Squared: 0.547, p-value: 0 MV (and CVs) on DV Estimate SE Pr(>|t|) LL UL Beta (Intercept) 2.761 0.268 0.000 2.236 3.286 NA biaaq.total -0.021 0.002 0.000 -0.025 -0.017 -0.737 genderMale -0.272 0.059 0.000 -0.388 -0.157 -0.256 asir.comp -0.173 0.051 0.001 -0.273 -0.073 -0.241 R-Squared: 0.417, p-value: 0 IV (and CVs) on DV Estimate SE Pr(>|t|) LL UL Beta (Intercept) 1.230 0.233 0.000 0.773 1.688 NA mbsrq.ae -0.181 0.034 0.000 -0.248 -0.114 -0.357 genderMale -0.133 0.068 0.051 -0.266 0.000 -0.125 asir.comp 0.084 0.049 0.086 -0.011 0.180 0.117 R-Squared: 0.208, p-value: 0 IV and MV (and CVs) on DV Estimate SE Pr(>|t|) LL UL Beta (Intercept) 2.806 0.269 0.000 2.278 3.334 NA mbsrq.ae -0.043 0.033 0.191 -0.108 0.022 -0.086 biaaq.total -0.020 0.002 0.000 -0.024 -0.015 -0.688 genderMale -0.258 0.060 0.000 -0.375 -0.141 -0.243 asir.comp -0.172 0.051 0.001 -0.272 -0.072 -0.239 R-Squared: 0.422, p-value: 0 ============================================ Bootstrap Mediation Effects with 95% BCa CIs ============================================ Indirect effect --------------- B = -0.138, SE = 0.024, 95% CI [-0.194, -0.098] Direct effect ------------- B = -0.043, SE = 0.038, 95% CI [-0.126, 0.025] Total effect ------------- B = -0.181, SE = 0.038, 95% CI [-0.259, -0.108]
Write up of results in APA format
Controlling for the gender, age, and body image investment, the mediation analysis showed that appearance evaluation had a significant total effect on appearance-fixing (B = -0.18, 95% CI [-0.26, -0.11], p < .05), a non-significant direct effect (B = -0.04, 95% CI [-0.13, 0.03], p = n.s.), and a significant indirect effect (B = -0.14, 95% CI [-0.19, -0.10], p < .05). This pattern of results suggests that body image inflexibility fully mediates the relationship between appearance evaluation and avoidance behaviours.
Reference
Preacher, K.J., & Hayes, A.F. (2004). SPSS and SAS procedures for estimating indirect effects in simple Mediation models. Behavior Research Methods, Instruments, & Computers, 36, 717-731.