May 9, 2016
Epi Vignettes: Interaction and effect modificationA brief synopsis of epidemiologic study design and methods with sample analytic code in R. Returning to the Epi Vignettes series I've been writing, this post focuses on the concepts of interaction and effect modification. These concepts are generally introduced in an intermediate level epidemiology course and for the most part are treated synonymously in their interpretation, with the difference being how they are assessed. I was trained that effect modification is assessed via stratification on a third variable, while interaction is assess via a product term in a regression model. While this simplification may be acceptable under most circumstances, there are some subtleties to assessing interaction and effect modification that I recently appreciated after peer review from several manuscripts. The terms in fact should not be used synonymously, as VanderWeele noted, and there is an important distinction. Although this distinction can be demonstrated statistically (as done in preceding article), I find a pragmatic definition more important. For the rest of this post, assume X and M are two exposures of interest, and Y is the outcome. The parameterization of the variables is unimportant as this time. Essentially we can say that effect modification (sometimes called effect heterogeneity, or heterogeneity of effects) occurs when, as a researcher, we have the ability to intervene on only one exposure (X), and interaction (sometimes called joint effects) occurs when we have the ability to intervene on both exposures (X and M). Let's take two examples. Example 1. We are interested in assessing the relationship between race (X), drug use (M), and HIV status (Y). Since we cannot "intervene" on race, but only drug use, an effect modification approach is appropriate. Example 2. We are interested in assessing the relationship between alcohol use (X), drug use (M), and HIV status (Y). Since we can intervene on both alcohol and drug use, an interaction approach is appropriate. Of importance, these are causal definitions; statistically both effect modification and interaction may be assessed with an X*M product term in a regression model. Approaches to testing for effect modification/heterogeneity of effects include stratified tests (e.g., Breslow-Day, Wald) or regression-based tests of interaction. For an excellent review of effect heterogeneity, see this article by Kaufman and MacLehose. Of note, the authors caution not to overly rely on the p-value of these tests as they may be underpowered to detect a difference thus concluding effect homogeneity, but rather compare the magnitude of effects within the strata of variable M. Generally stratum-specific estimates are used in the case of effect modification, especially when M is binary or has a relatively few number of categories. Otherwise the methods become unwieldy. As most researchers seem to prefer a regression based approach to testing for interaction the rest of this blog post will assume a product term (X*M) in a regression model, yet assessing interaction is more than a product term in a regression model: there is statistical interaction (literally the result of the statistical test for the interaction term) and causal interaction (whether intervening on the variables X and/or M will have a causal effect on outcome Y). For now, X*M will be treated as a statistical interaction. Depending upon the type of regression, the statistical interaction may be additive, multiplicative, or both. Additive interaction stipulates the summed effect of X and M will be greater than their individual effects, while multiplicative interaction stipulates that the product of X and M will be greater than their individual effects. As may be inferred in a straight linear regression model, which is on the linear scale, the interaction term represents additive interaction. In a generalized linear regression model such as logistic or Poisson regression, which is on the log scale, the interaction term represents multiplicative interaction. Stated another way, the multiplicative scale is a comparison of relative measures of association across strata (relative risk/odds ratio), while the additive scale is a comparison of difference measures of association across strata (e.g., attributable risk/risk difference). As absolute measures are more frequently used in in public health practice to demonstrate how interventions affect an outcome, it may be preferable to report additive interactions. However, since the X*M term in generalized linear regressions only indicate multiplicative interaction, the additive component must be manually calculated, and depends in part on the parameterization of X and M:
It is at the epidemiologist's discretion which measure of interaction to report from a multiplicative scale model. In practice it may be best to present both the multiplicative and additive interaction measures in a standardized fashion. For a comprehensive review of interaction in epidemiological analyses, see this recent article by VanderWeele and Knol. Sample codes in R Heterogeneity of effects approach #Wald test for heterogeneity of effects between X and Y by strata M model1 = glm(Y ~ X + covariates, data=subset(dataset, M==0), family=binomial(link="logit"))) model2 = glm(Y ~ X + covariates, data=subset(dataset, M==1), family=binomial(link="logit"))) Get estimates and standard errors b1 = summary(model1)$coefficients["X","Estimate"] b1_se = summary(model1)$coefficients["X"," Std. Error"] b2 = summary(model2)$coefficients["X","Estimate"] b2_se = summary(model2)$coefficients["X"," Std. Error"] Compute test statistic (Z-score) zscore = (b2-b1)/(sqrt((b1_se^2) + (b2_se^2))) Check against normal distribution 2*pnorm(-abs(zscore)) Joint effects approach Specify interaction model Interaction_model = glm(Y ~ X*M + covariates, data=dataset, family=binomial(link="logit")) Multiplicative effects are from term X:M in model summary(Interaction_model) If X is continuous let's mean center dataset$X_centered = scale(dataset$X, center=T, scale=F) If M is protective let's assign the lowest risk category as referent dataset$M_additive = relevel(as.factor(dataset$M), ref=2) Calculate additive interaction: additive_left = exp(coef(Interaction_model)["X"] + coef(Interaction_model)["M"] + coef(Interaction_model)["X:M"] - 1) additive_right = exp(coef(Interaction_model)["X"] - 1) + exp(coef(Interaction_model)["M"] - 1) Presence of additive interaction? additive_left == additive_right #none additive_left < additive_right #negative additive_left > additive_right #positive RERI RERI = exp(coef(Interaction_model)["X"] + coef(Interaction_model)["M"] + coef(Interaction_model)["X:M"]) - exp(coef(Interaction_model)["X"]) - exp(coef(Interaction_model)["M"]) + 1 AP AP = RERI / exp(coef(Interaction_model)["X"] + coef(Interaction_model)["M"] + coef(Interaction_model)["X:M"]) Synergy index S = (exp(coef(Interaction_model)["X"] + coef(Interaction_model)["M"] + coef(Interaction_model)["X:M"]) - 1) / ((exp(coef(Interaction_model)["X"]) - 1) + (exp(coef(Interaction_model)["M"]) - 1)) Bootstrap for CI boot_ci = boot(dataset, bootAdditive, 1000, method="Synergy", parallel="multicore", ncpus=4) boot.ci(boot_ci, type="norm", index=1) Bootstrap function bootAdditive = function(data,index,method) { bootdata = data[index,] model = glm(Y ~ X*M + covariates, data=bootdata, family=binomial(link="logit")) if (method=="RERI") { additive = exp(coef(model)["X"] + coef(model)["M"] + coef(model)["X:M"]) - exp(coef(model)["X"]) - exp(coef(model)["M"]) + 1 } else if (method=="AP") { RERI = exp(coef(model)["X"] + coef(model)["M"] + coef(model)["X:M"]) - exp(coef(model)["X"]) - exp(coef(model)["M"]) + 1 additive = RERI / exp(coef(model)["X"] + coef(model)["M"] + coef(model)["X:M"]) } else if (method=="Synergy") { additive = (exp(coef(model)["X"] + coef(model)["M"] + coef(model)["X:M"]) - 1) / ((exp(coef(model)["X"]) - 1) + (exp(coef(model)["M"]) - 1)) } return(as.numeric(additive)) } Cite: Goldstein ND. Epi Vignettes: Interaction and effect modification. May 9, 2016. DOI: 10.17918/goldsteinepi. |