Part 1: Mediation Analysis - The How...
Part 2: Moderation Analysis - The When...
Post your Qs on this Padlet...
Mediation occurs when the relationship between a predictor and an outcome can be explained by their relationship to a third variable: the mediator
Previously, we've looked at simple relationships between a predictor and an outcome:
We can use a mediation model whenever we use a linear model
Can be used in correlational or experimental designs
The mediator used and the pathways should be theoretically driven
But we need to be careful with our interpretations & conclusions:
Easy Error! Even in a mediation, correlation does not equal causation!!
We then partition this Total Effect into:
An Indirect Effect which is the effect of the predictor on the outcome, through the mediator
& a Direct Effect which is the effect of the predictor on the outcome, adjusting for the mediator
The terms for each of these pathways can feel counter-intuitive...
The Total Effect is made up of the Indirect Effect AND the Direct Effect
Even though we call it the Total Effect, it's still just the simple relationship between predictor and outcome - we haven't accounted for any other variables
By adding in a mediator, we can see how much of this Total Effect is a Direct Effect of our predictor, and how much can be explained by our mediator (i.e., the Indirect Effect)
Easy Error! People often get 'Direct Effect' & 'Total Effect' muddled up!! The Total Effect does not adjust for any other variables; the Direct Effect does.
SES is a predictor of children's maths attainment (Evans et al., 2018, 2020a, 2020b, 2020c)
There are many possible mechanisms underlying this relationship, which is what we are trying to uncover with mediation
So, starting with the simple relationship between predictor and outcome, we have the Total Effect of SES on children's maths ability:
You should use theory and research to decide what the mediator might be in this scenario
For today, parental involvement in educational activities seems to be a sensible mediator
And so our mediation model is:
Step 1. Does the predictor predict the outcome (Total Effect)?
Top Tip! The Indirect Effect is the product of paths a and b - we can just multiply the estimates together to get the Indirect Effect
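The Top Tip can be checked with ordinary regressions. The sketch below uses simulated data (the variable names and effect sizes are made up for illustration, not taken from the lecture data) and shows that, for linear models fitted to the same cases, the Total Effect equals the Direct Effect plus the product of paths a and b.

```r
# Simulated (hypothetical) data: predictor -> mediator -> outcome
set.seed(42)
n <- 500
predictor <- rnorm(n)
mediator  <- 0.5 * predictor + rnorm(n)
outcome   <- 0.3 * predictor + 0.4 * mediator + rnorm(n)
sim_data  <- data.frame(predictor, mediator, outcome)

# Total Effect: predictor -> outcome, NOT adjusting for the mediator
total_mod <- lm(outcome ~ predictor, data = sim_data)
# Path a: predictor -> mediator
a_mod <- lm(mediator ~ predictor, data = sim_data)
# Direct Effect (path c) and path b come from the same model
cb_mod <- lm(outcome ~ predictor + mediator, data = sim_data)

total_effect    <- coef(total_mod)[["predictor"]]
a_path          <- coef(a_mod)[["predictor"]]
direct_effect   <- coef(cb_mod)[["predictor"]]
b_path          <- coef(cb_mod)[["mediator"]]
indirect_effect <- a_path * b_path

# Total Effect = Direct Effect + Indirect Effect
all.equal(total_effect, direct_effect + indirect_effect)
```

This separate-regressions approach is only a way of seeing where the numbers come from; it does not give a standard error or confidence interval for the Indirect Effect.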
We could test this with a multiple regression in R where we adjust for our predictor, but here we use the sem() function from the lavaan package:

    # step 1: define the model
    my_mod <- 'outcome ~ c*predictor + b*mediator
               mediator ~ a*predictor

               indirect_effect := a*b
               total_effect := c + (a*b)
               '

    # step 2: fit the model with FIML and robust SEs
    my_fit <- lavaan::sem(my_mod, data = my_data, missing = "FIML", estimator = "MLR")

    # step 3: summarize the model
    broom::glance(my_fit)
    broom::tidy(my_fit, conf.int = TRUE)

Don't Panic! We'll go over the code in the discovr_10 tutorial & in the skills labs!
    # step 1: define the model
    my_mod <- 'maths_att ~ c*ses + b*parent_inv
               parent_inv ~ a*ses

               indirect_effect := a*b
               total_effect := c + (a*b)
               '

    # step 2: fit the model with FIML and robust SEs
    my_fit <- lavaan::sem(my_mod, data = my_data, missing = "FIML", estimator = "MLR")

    # step 3: summarize the model
    broom::glance(my_fit)
    broom::tidy(my_fit, conf.int = TRUE)

term | op | label | estimate | std.error | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|---|---|
maths_att ~ ses | ~ | c | 4.071 | 0.455 | 8.951 | 0 | 3.179 | 4.962 |
maths_att ~ parent_inv | ~ | b | 71.906 | 6.394 | 11.245 | 0 | 59.373 | 84.439 |
parent_inv ~ ses | ~ | a | 0.024 | 0.004 | 6.731 | 0 | 0.017 | 0.031 |
maths_att ~~ maths_att | ~~ |  | 343190.477 | 23717.965 | 14.470 | 0 | 296704.119 | 389676.835 |
parent_inv ~~ parent_inv | ~~ |  | 27.000 | 1.884 | 14.331 | 0 | 23.307 | 30.692 |
ses ~~ ses | ~~ |  | 4713.358 | 0.000 | NA | NA | 4713.358 | 4713.358 |
maths_att ~1 | ~1 |  | 2077.997 | 325.090 | 6.392 | 0 | 1440.833 | 2715.161 |
parent_inv ~1 | ~1 |  | 50.027 | 0.848 | 58.991 | 0 | 48.365 | 51.689 |
ses ~1 | ~1 |  | -254.129 | 0.000 | NA | NA | -254.129 | -254.129 |
indirect_effect := a*b | := | indirect_effect | 1.727 | 0.330 | 5.240 | 0 | 1.081 | 2.373 |
total_effect := c+(a*b) | := | total_effect | 5.798 | 0.472 | 12.281 | 0 | 4.873 | 6.724 |
*these results are fictional
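As a quick arithmetic check, the defined parameters in the output table can be reproduced by hand from the labelled paths. The snippet below copies the rounded (fictional) estimates from the table; the object names are just illustrative. Because path a is rounded to three decimals in the table, the hand-calculated values differ from the reported ones in the third decimal place.

```r
# Estimates copied from the (fictional) output table
a_path   <- 0.024   # parent_inv ~ ses
b_path   <- 71.906  # maths_att ~ parent_inv
c_direct <- 4.071   # maths_att ~ ses (Direct Effect)

# Indirect Effect = a * b; Total Effect = c + (a * b)
indirect_effect <- a_path * b_path
total_effect    <- c_direct + indirect_effect

round(indirect_effect, 2)  # reported in the table as 1.727
round(total_effect, 2)     # reported in the table as 5.798
```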
We can interpret the bs in the same way as we would with a linear model:
Path a tells us the effect of the predictor on the mediator
Path b tells us the effect of the mediator on the outcome, adjusting for the predictor
Path c (Direct Effect) tells us the effect of the predictor on the outcome, adjusting for the mediator
The Indirect Effect tells us whether there is a mediation
The Total Effect tells us the effect of the predictor on the outcome, NOT adjusting for the mediator
To see if we have a significant mediation, we look at the size, the bootstrapped confidence interval, and the p-value of the Indirect Effect
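To illustrate what a bootstrapped confidence interval for the Indirect Effect involves, here is a minimal base-R sketch on simulated data. It is an assumption-laden illustration rather than the lecture's method: it uses a simple percentile bootstrap (not BCa), ordinary regressions rather than lavaan, and made-up variable names and effect sizes.

```r
# Simulated (hypothetical) data: predictor -> mediator -> outcome
set.seed(123)
n <- 300
sim_data <- data.frame(predictor = rnorm(n))
sim_data$mediator <- 0.5 * sim_data$predictor + rnorm(n)
sim_data$outcome  <- 0.3 * sim_data$predictor + 0.4 * sim_data$mediator + rnorm(n)

# Indirect Effect (a * b) estimated from one data set
indirect_effect <- function(d) {
  a_path <- coef(lm(mediator ~ predictor, data = d))[["predictor"]]
  b_path <- coef(lm(outcome ~ predictor + mediator, data = d))[["mediator"]]
  a_path * b_path
}

# Resample rows with replacement and re-estimate a * b each time
boot_est <- replicate(1000, {
  rows <- sample(nrow(sim_data), replace = TRUE)
  indirect_effect(sim_data[rows, ])
})

estimate <- indirect_effect(sim_data)
boot_ci  <- quantile(boot_est, probs = c(0.025, 0.975))  # percentile 95% CI

estimate
boot_ci
```

If the interval excludes 0, that is consistent with an Indirect Effect; the size of the estimate still needs to be judged on its own terms.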
We can have different types of mediation:
Partial mediation is when the Direct Effect (c) is reduced but still significant - there's both an indirect and a direct effect of the predictor on the outcome
Full mediation is when the Direct Effect (c) is reduced to nonsignificance - the effect of the predictor on the outcome goes entirely through the mediator
It's generally best to avoid labelling mediations as full or partial on the basis of p-values, because significance tests encourage all-or-nothing conclusions. Instead, it's better to ask: is the size of the mediation effect substantial enough to care about?
*these results are fictional
"There was a significant indirect effect of SES on children's maths attainment through parental involvement in educational activities, b = 1.73, 95% BCa CI [1.08, 2.37], p < .001."
Pet Peeve! Your result is NOT 'insignificant' - it is nonsignificant!!
So we've looked at how variables are related with mediation
With moderation we're looking at when variables are related
With moderation we can investigate whether the effect of our predictor is the same for all people or whether it differs under different conditions depending on the value of another variable – the moderator
Differences could be the presence of an effect, the size of the effect, or the direction of the effect
A moderator is a variable that affects the relationship between two others - it modifies it
Can be used in correlational or experimental designs with continuous or categorical variables
It's mathematically the same as the interactions you encountered in Discovering Statistics last term - we run a moderation in the same way as a linear model with 3 predictors:
The key conceptual difference is that you are implying a variable (the moderator) alters the relationship between the other two variables
The moderator chosen should be theoretically driven - the same as a mediation
However, the same issues around causation apply here too
We are implying a directional relationship (i.e., the moderator modifies the relationship) but correlation does not equal causation unless you have a causal design
Easy Error! Predictors and moderators are mathematically the same in our model - the only differences are conceptual!
Parental maths anxiety is a predictor of children's maths anxiety
But research suggests that when parents help with maths homework, the effects might be different for parents experiencing maths anxiety, and that under certain conditions helping might actually be harmful
One idea is that when parents have maths anxiety, helping with maths homework can increase their child's maths anxiety, whereas help from parents without maths anxiety has a less negative impact
We're hypothesising that the relationship between parental maths anxiety and children's maths anxiety changes depending on how many hours parents help with homework
The change could be in the presence, size, or direction of the effect
We can run a linear model with 3 predictors to test this idea
Where we have two main effects and one interaction effect
We must include the main effects for both the predictor and moderator otherwise the interaction and main effects are confounded & a significant interaction can’t be interpreted
If the interaction is significant, then that is evidence of moderation
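Written out, the moderation model described above is a linear model with two main effects and their product. The symbols below are generic placeholders (the b subscripts are an assumed labelling, not taken from the slides):

```latex
\text{outcome}_i = b_0
  + b_1\,\text{predictor}_i
  + b_2\,\text{moderator}_i
  + b_3\,(\text{predictor}_i \times \text{moderator}_i)
  + \varepsilon_i
```

Here b_1 and b_2 are the main effects and b_3 is the interaction effect; it is b_3 that provides the evidence of moderation.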
When using a linear model with multiple predictors, each of the effects (bs) is interpreted when the other variables in the model are 0
So if we had parental maths anxiety and parental homework help as predictors in a non-moderated linear model, we would interpret the b of parental maths anxiety when parental homework help is at 0, and vice versa
In some scenarios this interpretation isn't problematic because 0 is a plausible and meaningful score
But often it doesn't make sense for a predictor to be interpreted in this way because a score of 0 isn't possible or doesn't make sense
The interaction term in our moderation means that the bs for the main effects are usually uninterpretable
With an interaction, we're saying that the effect of our predictor changes at different levels of our moderator, so the b we get for each predictor will change depending on the values of the other variables in our model
When there is no interaction, this doesn’t matter because the b parameter doesn’t change at different levels of our moderator
To make interpretation easier, we centre our variables by transforming them into deviations around a fixed point - in grand mean centring this 'fixed point' is the overall mean of that measure
We can then interpret our effects at average levels of the other variable
This applies to our predictor and moderator only
Centring is super easy to do in R with mutate() and mean(): we're just taking the overall mean away from each participant's individual score - we'll go over the code in your skills lab this week!
Easy Error! You can use centring in any model where it makes sense, we need to use it in moderations for interpretation but we can use it elsewhere too!
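Grand mean centring with mutate() and mean() might look something like the sketch below. The data values are invented, and the `_cent` column names are hypothetical; the skills lab may use different names or overwrite the original columns instead.

```r
# Hypothetical scores; column names follow the example in this section
ma_data <- data.frame(
  parent_maths_anx = c(12, 18, 25, 31, 40),
  parent_hw_help   = c(0.5, 1.0, 2.0, 3.5, 4.0)
)

# Grand mean centring: subtract the overall mean from each individual score
ma_centred <- dplyr::mutate(
  ma_data,
  parent_maths_anx_cent = parent_maths_anx - mean(parent_maths_anx, na.rm = TRUE),
  parent_hw_help_cent   = parent_hw_help - mean(parent_hw_help, na.rm = TRUE)
)

# The centred variables now have a mean of (essentially) 0
colMeans(ma_centred)
```

Centring shifts the scores but does not change their spread, so the relationships between variables are unaffected; only the point at which the other effects are interpreted changes.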
    # step 1: fit model with our centred variables
    math_anx_lm <- lm(child_maths_anx ~ parent_maths_anx*parent_hw_help, data = ma_data)

    # step 2: summarize the model with robust SEs
    broom::tidy(math_anx_lm, conf.int = TRUE)
    parameters::model_parameters(math_anx_lm, robust = TRUE, vcov.type = "HC4", digits = 3)

term | estimate | std.error | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|
(Intercept) | 40.598 | 0.310 | 130.958 | 0 | 39.988 | 41.207 |
parent_maths_anx | 0.507 | 0.018 | 28.099 | 0 | 0.471 | 0.542 |
parent_hw_help | -166.756 | 39.805 | -4.189 | 0 | -245.052 | -88.460 |
parent_maths_anx:parent_hw_help | 23.955 | 3.962 | 6.046 | 0 | 16.161 | 31.749 |
Top Tip! Higher-order effects refer to interactions, and lower-order effects refer to the main effects!
*these results are fictional
If we find a significant moderation effect, we need to follow it up to understand how the relationship between the predictor and the outcome changes at different values of the moderator
Remember that a moderation can occur from differences in effect size, direction, or presence...
Without probing the interaction further, we don't know what's actually happening in this relationship
We can use two techniques to follow up a significant interaction effect:
Simple slopes analysis
Johnson-Neyman interval
These tell us the coefficient for our predictor (i.e., the effect it has on our outcome) at different values of our moderator
Here we compare the relationship between our predictor and our outcome, at low, mean, and high levels of our moderator (using SDs)
We get models of parental maths anxiety and children's maths anxiety at low parental homework help (-1 SD), mean parental homework help (0), and high parental homework help (+1 SD)
When homework help is high (+1 SD from mean), parental maths anxiety is strongly positively related to children's anxiety scores
At the mean of homework help (0), parental maths anxiety is positively related to children's anxiety scores
When homework help is low (-1 SD), parental maths anxiety is more weakly positively related to children's anxiety scores
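The simple slopes described above can be computed by hand from the model coefficients: the slope of the predictor at a given moderator value is the predictor's b plus the interaction b multiplied by that moderator value. The sketch below uses simulated, already-centred data with made-up names and effect sizes; it gives the slopes only, not their standard errors or tests.

```r
# Simulated (hypothetical) centred predictor and moderator
set.seed(2024)
n <- 400
anx_cent  <- rnorm(n, sd = 10)
anx_cent  <- anx_cent - mean(anx_cent)
help_cent <- rnorm(n, sd = 1)
help_cent <- help_cent - mean(help_cent)
child_maths_anx <- 40 + 0.5 * anx_cent - 1 * help_cent +
  0.2 * anx_cent * help_cent + rnorm(n, sd = 5)
sim_data <- data.frame(child_maths_anx, anx_cent, help_cent)

# Moderation model: two main effects plus their interaction
mod <- lm(child_maths_anx ~ anx_cent * help_cent, data = sim_data)
b <- coef(mod)

# Moderator values: -1 SD, the mean (0 after centring), and +1 SD
mod_values <- c(low = -1, mean = 0, high = 1) * sd(sim_data$help_cent)

# Simple slope of the predictor at each moderator value
simple_slopes <- b[["anx_cent"]] + b[["anx_cent:help_cent"]] * mod_values
simple_slopes
```

Because the moderator is centred, the slope at the mean is simply the main-effect coefficient for the predictor.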
Instead of only looking at low, mean, and high values of our moderator, we can look at many values of it with the Johnson-Neyman Interval
This interval estimates the model of our predictor and outcome, at lots of different values of our moderator
We get a 'zone of significance', i.e., the interval in which the relationship between our predictor and outcome is significant
JOHNSON-NEYMAN INTERVAL
When parent_hw_help is OUTSIDE the interval [-0.03, -0.02], the slope of parent_maths_anx is p < .05.
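To show the idea behind this output, the sketch below approximates a zone of significance on a grid of moderator values: at each value it computes the predictor's slope and its standard error from the model's coefficient covariance matrix, then flags where the slope is significant at p < .05. This is a rough grid approximation on simulated data with made-up names, not the exact Johnson-Neyman solution, and it makes no correction for testing many values.

```r
# Simulated (hypothetical) centred predictor and moderator
set.seed(7)
n <- 400
anx_cent  <- rnorm(n, sd = 10)
anx_cent  <- anx_cent - mean(anx_cent)
help_cent <- rnorm(n, sd = 1)
help_cent <- help_cent - mean(help_cent)
child_maths_anx <- 40 + 0.1 * anx_cent - 1 * help_cent +
  0.2 * anx_cent * help_cent + rnorm(n, sd = 5)
sim_data <- data.frame(child_maths_anx, anx_cent, help_cent)

mod <- lm(child_maths_anx ~ anx_cent * help_cent, data = sim_data)
b <- coef(mod)
V <- vcov(mod)
t_crit <- qt(0.975, df = df.residual(mod))

# Slope of the predictor, and its SE, at many values of the moderator
grid  <- seq(min(sim_data$help_cent), max(sim_data$help_cent), length.out = 200)
slope <- b[["anx_cent"]] + b[["anx_cent:help_cent"]] * grid
slope_se <- sqrt(
  V["anx_cent", "anx_cent"] +
    2 * grid * V["anx_cent", "anx_cent:help_cent"] +
    grid^2 * V["anx_cent:help_cent", "anx_cent:help_cent"]
)
significant <- abs(slope / slope_se) > t_crit

# Approximate interval of moderator values where the slope is NOT significant;
# outside this interval the slope is p < .05
range(grid[!significant])
```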
"At the mean of parental homework help, there was a significant positive relationship between parental maths anxiety and child maths anxiety, b = 0.51, t = 28.10, p < .001, 95% CI [0.47, 0.54].
At the mean of parental maths anxiety, there was a significant negative relationship between parental homework help and child maths anxiety, b = -166.76, t = -4.19, p < .001, 95% CI [-245.05, -88.46].
The interaction between parental maths anxiety and parental homework help was significantly related to child maths anxiety, b = 23.96, t = 6.05, p < .001, 95% CI [16.16, 31.75].
When parental homework help is low (-1 SD), there is a significant positive relationship between parental maths anxiety and child maths anxiety, b = 0.34, 95% CI [0.27, 0.40], t = 9.97, p < .001.
At the mean of parental homework help, there is a significant positive relationship between parental maths anxiety and child maths anxiety, b = 0.51, 95% CI [0.47, 0.54], t = 30.04, p < .001.
When parental homework help is high (+1 SD), there is a significant positive relationship between parental maths anxiety and child maths anxiety, b = 0.68, 95% CI [0.62, 0.73], t = 22.6, p < .001."
Top Tip! If we're using grand mean centring, the effects should be interpreted at the mean values of the variables!
Predictor | b | SE B | t | p |
---|---|---|---|---|
Intercept | 40.598 | 0.310 | 130.958 | <.001 |
Parental Maths Anxiety | 0.507 | 0.018 | 28.099 | <.001 |
Parental Homework Help | -166.756 | 39.805 | -4.189 | <.001 |
Parental Maths Anxiety x Parental Homework Help | 23.955 | 3.962 | 6.046 | <.001 |
Mediation occurs when the relationship between two variables can be explained (either in part or in full) by another variable
Moderation occurs when the relationship between two variables changes as a function of a third variable
& finally, correlation doesn't equal causation!!