Nature is complex. This seems like an obvious statement, but too often we reduce it to straightforward models: y ~ x and that sort of thing. Not that there’s anything wrong with that: sometimes y is actually directly a function of x, and anything else would be, in the words of Brian McGill, ‘statistical machismo.’
But I would wager that, more often than not, y is not directly a function of x. Rather, y may be affected by a host of direct and indirect factors, which themselves affect one another directly and indirectly. If only there were some way to translate this network of interacting factors into a statistical framework to better and more realistically understand nature. Oh wait, structural equation modeling.
[Updated March 10, 2016: You can find more materials relating to SEM, including lectures, example analyses, and R code here: https://jonlefcheck.net/teaching/].
[Updated October 13, 2015: Active development has moved to my piecewiseSEM package on Github, so please see the link for the latest versions of all functions.]
What is structural equation modeling?
Structural equation modeling, or SEM for short, is a statistical framework that, ‘unites two or more structural models to model multivariate relationships,’ in the words of Jim Grace. Here, multivariate relationships refers to the sum of direct and indirect interactions among variables. In practice, it essentially strings together a series of linear relationships.
These relationships can be represented using a path diagram with arrows denoting which variables are influencing (and influenced by) other variables. Take a very simple path diagram, in which X1 and X2 each point directly to Y.
This diagram is described by the equation: Y ~ X1 + X2.
Now consider a different arrangement of the same variables, in which X1 points to X2 and X2 points to Y.
This arrangement is described by two equations: X2 ~ X1 and Y ~ X2. These two equations make up the SEM.
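To make the two arrangements concrete, here is a minimal sketch in R of how each one can be written down as a list of formulas. The object names (sem_direct, sem_chain) and the helper list_paths() are hypothetical, made up for this illustration only.

```r
# Each SEM is just a list of formulas, one per response variable.
sem_direct <- list(Y ~ X1 + X2)        # X1 -> Y and X2 -> Y
sem_chain  <- list(X2 ~ X1, Y ~ X2)    # X1 -> X2 -> Y

# Hypothetical helper: turn each formula into "predictor -> response" strings.
list_paths <- function(sem) {
  unlist(lapply(sem, function(f) {
    vars <- all.vars(f)                # response comes first, then predictors
    paste(vars[-1], "->", vars[1])
  }))
}

print(list_paths(sem_direct))
print(list_paths(sem_chain))
```

Listing the paths this way is a quick check that the equations you wrote actually match the diagram you drew.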
Assumptions of SEM
There are a few important points to make:
(1) As I pointed out, I’m talking about linear relationships. There are ways to integrate non-linear relationships into SEM (see Cardinale et al. 2009 for an excellent example), but for the most part, we’re dealing with straight lines.
(2) Second, we assume that the relationships above are causal. By causal, I mean that we have structured our models such that we assume X1 directly affects Y, and not the other way around. This is a big leap from more traditional statistics, where the phrase ‘correlation does not imply causation’ is drilled into our brains. The reason we can infer causative relationships has to do with the underlying philosophy of SEM. SEM is designed principally to test competing hypotheses about complex relationships.
In other words, I might hypothesize that Y is directly influenced by both X1 and X2, as in the first example above. Alternately, I might suspect that X1 indirectly affects Y through X2. I might have arrived at these hypotheses through prior experimentation or expert knowledge of the system, or I may have derived them from theory (which itself is supported, in most cases, by experimentation and expert knowledge).
Either way, I have mathematically specified a series of causal assumptions in a causal model (e.g., Y ~ X1 + X2) which I can then test with data. Path significance, the goodness-of-fit, and the magnitude of the relationships all then have some bearing on which logical statements can be extracted from the test: X1 does not directly affect Y, etc. For more information on the debate over causation in SEMs, see chapters by Judea Pearl and Kenneth Bollen.
(3) Third, in the second example, we have not estimated a direct effect of X1 on Y (i.e., there is no arrow from X1 to Y, as in the first example). We can, however, estimate the indirect effect as the product of the effect of X1 on X2 and the effect of X2 on Y. Thus, by constructing a causal model, we can identify both direct and indirect effects simultaneously, which could be tremendously useful in quantifying cascading effects.
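As a rough illustration of point (3), the sketch below simulates data from the chain X1 -> X2 -> Y and estimates the indirect effect as the product of the two path coefficients. The sample size, the true coefficients (0.5 and 0.8), and all object names are assumptions chosen for the example, not values from any real dataset.

```r
set.seed(42)

# Simulate the chain X1 -> X2 -> Y with made-up path strengths.
n  <- 500
X1 <- rnorm(n)
X2 <- 0.5 * X1 + rnorm(n)   # assumed effect of X1 on X2
Y  <- 0.8 * X2 + rnorm(n)   # assumed effect of X2 on Y

# Fit each path as its own linear model.
m_x2 <- lm(X2 ~ X1)
m_y  <- lm(Y ~ X2)

a <- unname(coef(m_x2)["X1"])   # estimated effect of X1 on X2
b <- unname(coef(m_y)["X2"])    # estimated effect of X2 on Y

# Indirect effect of X1 on Y is the product of the two paths.
indirect <- a * b
print(c(a = a, b = b, indirect = indirect))
```

With these simulated values the true indirect effect is 0.5 * 0.8 = 0.4, so the estimate should land somewhere near that.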
Traditional SEM attempts to reproduce the entire variance-covariance matrix among all variables, given the specified model. The discrepancy between the observed and predicted matrices defines the model’s goodness-of-fit. Parameter estimates are derived that minimize this discrepancy, typically using something like maximum likelihood.
There are, however, a few shortcomings to traditional SEM. First, it assumes that all variables are derived from a normal distribution.
Second, it assumes that all observations are independent. In other words, there is no underlying structure to the data. These assumptions are often violated in ecological research. Count data, for example, are better described by a Poisson distribution. We often design experiments that have hierarchical structure (e.g., blocking) or conduct observational studies where certain sets of variables are more related than others (spatially, temporally, etc.).
Finally, traditional SEM requires quite a large sample size: at least (at least!) 5 samples per estimated parameter, but preferably 10 or more. This issue can be particularly problematic if variables are nested, where the sample size is limited to the use of variables at the highest level of the hierarchy. Such an approach drastically reduces the power of any analysis, not to mention the heartbreak of collapsing painstakingly collected data.
These violations are dealt with regularly by, for instance, fitting generalized linear models to a range of different distributions. We account for hierarchy by nesting variables in a mixed model framework, which also alleviates issues of sample size. However, these techniques are not amenable to traditional SEM.
These limitations have led to the development of an alternative kind of SEM called piecewise structural equation modeling. Here, paths are estimated in individual models, and then pieced together to construct the causal model. Piecewise SEM is actually a better gateway into SEM because it essentially is running a series of linear models, and ecologists know linear models.
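To show what "pieced together" can look like, here is a minimal sketch using only base R. It assumes a hypothetical chain X1 -> N -> Y in which N is a count, so the first path is fit with a Poisson GLM and the second with an ordinary linear model. The data are simulated and every name and coefficient is invented for the illustration.

```r
set.seed(1)

# Simulated data: N is a count driven by X1, and Y is driven by N.
n <- 300
d <- data.frame(X1 = rnorm(n))
d$N <- rpois(n, lambda = exp(0.3 + 0.5 * d$X1))
d$Y <- 1 + 0.2 * d$N + rnorm(n)

# The piecewise SEM is a list of separately fitted models,
# each with the error distribution that suits its response.
piecewise_sem <- list(
  glm(N ~ X1, family = poisson(link = "log"), data = d),
  lm(Y ~ N, data = d)
)

# Coefficient tables for each piece of the causal model.
print(lapply(piecewise_sem, function(m) round(coef(summary(m)), 3)))
```

The same list could just as well hold mixed models, which is the route taken in the worked example later in this post.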
Evaluating Fit in Piecewise SEM: Tests of d-separation
Piecewise SEM is a more flexible and potentially more powerful technique for the reasons outlined above, but it comes with its own set of restrictions. First, estimating goodness-of-fit and comparing models is not straightforward. In traditional SEM, we can easily derive a chi-squared statistic that describes the degree of agreement between the observed and predicted variance-covariance matrices. We can’t do that here, because we’re estimating a separate variance-covariance matrix for each piecewise model.
Enter Bill Shipley, who derived two criteria of model fit for piecewise SEM. The first is his test of d-separation. In essence, this test asks whether any paths are missing from the model, and whether the model would be improved with the inclusion of the missing path(s). To understand this better, we must first define a few terms.
Shipley’s perspective is based on what are known as directed acyclic graphs (DAGs), which are essentially path diagrams, as above. Two variables are considered causally dependent if there is an arrow between them. They are causally independent if there is not an arrow between them. Consider the following example, in which X1 and X2 each point to Y1, and Y1 points to Y2.
X1 is causally independent of Y2 because no arrow exists between them. However, X1 indirectly influences Y2 through Y1. Thus, X1 is causally independent of Y2 conditional on Y1. This is an important distinction, because it has implications for the way in which we test whether the missing arrow between X1 and Y2 is important.
A test of directional separation (d-sep) asks whether the causally independent paths are significant while statistically controlling for variables on which these paths are conditional. To begin, one must list all the pairs of variables with no arrows between them, then list all the other variables that are direct causes of either variable in the pair. These independence claims, and their conditioning variables, form the basis set.
For the example above, the basis set is:
X1 and X2
X1 and Y2, conditional on Y1
X2 and Y2, conditional on Y1
The basis set can then be turned into a series of linear models:
X1 ~ X2
X1 ~ Y2 + Y1
X2 ~ Y2 + Y1
As you can see, we include the conditioning variable (Y1) as a covariate, but we are interested in the missing relationships (e.g., X1 ~ X2). We can run these models and extract the p-values associated with each missing path, all the while controlling for Y1 where the claim is conditional. From these p-values, we can calculate a Fisher’s C statistic using the following equation:
$$C = -2 \sum_{i=1}^{k} \ln(p_i)$$
which follows a chi-squared distribution with 2k degrees of freedom (where k = the number of pairs in the basis set and p_i is the p-value for the i-th pair). Thus, if a chi-squared test is run on the C statistic and P < 0.05, then the model is not a good fit. In other words, one or more of the missing paths contains some useful information. Conversely, if P > 0.05, then the model represents the data well, and no paths are missing.
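The whole procedure can be sketched in a few lines of base R. The example below simulates data from a DAG in which X1 and X2 point to Y1 and Y1 points to Y2, fits the three independence claims as linear models, and combines their p-values into Fisher’s C. The simulated coefficients and the helper term_p() are assumptions made for this illustration; this is not the function described later in this post.

```r
set.seed(2009)

# Simulate from the DAG: X1 -> Y1 <- X2, Y1 -> Y2 (made-up path strengths).
n <- 200
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
d$Y1 <- 0.6 * d$X1 + 0.4 * d$X2 + rnorm(n)
d$Y2 <- 0.7 * d$Y1 + rnorm(n)

# Hypothetical helper: p-value of a single term in a linear model.
term_p <- function(formula, term, data) {
  summary(lm(formula, data = data))$coefficients[term, "Pr(>|t|)"]
}

# One p-value per independence claim in the basis set.
p_values <- c(
  term_p(X1 ~ X2,      "X2", d),   # X1 and X2
  term_p(X1 ~ Y2 + Y1, "Y2", d),   # X1 and Y2, conditional on Y1
  term_p(X2 ~ Y2 + Y1, "Y2", d)    # X2 and Y2, conditional on Y1
)

k <- length(p_values)                            # pairs in the basis set
C <- -2 * sum(log(p_values))                     # Fisher's C
P <- pchisq(C, df = 2 * k, lower.tail = FALSE)   # chi-squared test on 2k df

print(c(C = C, df = 2 * k, P = P))
```

Because the data were simulated from the same DAG that is being tested, a large P is the expected outcome here, though any single simulated dataset can of course disagree.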
Evaluating Fit in Piecewise SEM: AIC
SEM is often implemented in a model testing framework (as opposed to a more exploratory analysis). In other words, we are taking a priori models of how we think the world works, and testing them against each other: reversing paths, dropping variables or relationships, and so on. A popular way to compare nested models is the use of the Akaike Information Criterion (AIC). Shipley has recently extended d-sep tests to use AIC. It’s actually quite straightforward:
$$\mathrm{AIC} = \left[ -2 \sum_{i=1}^{k} \ln(p_i) \right] + 2K$$
The formula should look quite familiar if you’ve ever calculated an AIC. In this case, the gobbledy gook in brackets is simply C, above. In this case, however, K is not the number of pairs in the basis set (that’s little k, not big K). Rather, K is the number of parameters estimated across all models. The additive term can be modified to provide an estimate of AIC corrected for small sample size (AICc).
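As a small sketch of the arithmetic, the function below computes AIC from C and K, and AICc using the small-sample correction C + 2K(n / (n - K - 1)), which is my understanding of the form given in Shipley (2013). The function name dsep_aic() and the input values are hypothetical, chosen only to show the calculation.

```r
# Hypothetical helper: AIC and AICc from Fisher's C.
#   C = Fisher's C statistic
#   K = number of parameters estimated across all models
#   n = sample size (needed only for the small-sample correction)
dsep_aic <- function(C, K, n = NULL) {
  aic  <- C + 2 * K
  aicc <- if (is.null(n)) NA_real_ else C + 2 * K * (n / (n - K - 1))
  c(AIC = aic, AICc = aicc)
}

# Made-up inputs for illustration.
res <- dsep_aic(C = 5.63, K = 8, n = 100)
print(round(res, 2))
```

As n grows relative to K, the correction factor n / (n - K - 1) approaches 1 and AICc converges on AIC.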
An Example from Shipley (2009)
In his 2009 paper, Shipley sets forth an example of tree growth and survival among 20 sites varying in latitude using the following DAG, a single causal chain: lat -> DD -> Date -> Growth -> Live.
Happily for our purposes, Shipley has provided the raw data as a supplement, so we can construct, test, and evaluate this SEM. In fact, I’ve written a function that takes a list of models, performs tests of d-sep, and calculates a chi-squared statistic and an AIC value for the entire model. The function alleviates some of the frustration of having to derive the basis set (an exercise I have never reliably performed) and then craft each model individually, which can be cumbersome with many variables and missing paths. You can find the most up to date version of the function on GitHub.
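First, we load up the required packages:
    # Load required libraries
    library(lmerTest)      # also attaches lme4, which provides glmer()
    library(nlme)          # provides lme()
    library(piecewiseSEM)  # assumed source of the shipley2009 data (see the update note above)
Then we load the data and build the model set:
    data(shipley2009)

    # One model per path in the DAG, with trees nested within sites
    shipley2009.modlist = list(
      lme(DD ~ lat, random = ~1|site/tree, na.action = na.omit, data = shipley2009),
      lme(Date ~ DD, random = ~1|site/tree, na.action = na.omit, data = shipley2009),
      lme(Growth ~ Date, random = ~1|site/tree, na.action = na.omit, data = shipley2009),
      # Survival is a binary outcome, so it gets a binomial GLMM
      glmer(Live ~ Growth + (1|site) + (1|tree), family = binomial(link = "logit"), data = shipley2009)
    )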
Finally, we can derive the fit statistics by passing the model list to the function, which reveals that C = 11.54 and P = 0.48, implying that this model is a good fit to the data. If we wanted to compare this model against alternatives, the AIC score is 49.54 and the AICc is 50.08.*
*These values differ from those reported in Shipley (2009) as the result of updates to the R packages for mixed models, and the fact that he did not model survivorship as a binomial outcome, as that functionality is not implemented in the nlme package. I use the lmerTest package as it returns p-values for merMod objects based on the oft-debated Satterthwaite approximation popular in SAS, but not in R.
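The reported statistics can be sanity-checked with a little arithmetic. The sketch below assumes the DAG is the five-variable chain implied by the model list (four paths), so the number of independence claims k is the number of variable pairs with no arrow between them. Neither k nor K is stated in this post; both are inferred here from the reported numbers.

```r
# Inferred, not reported: a 5-variable chain with 4 paths.
n_vars  <- 5
n_paths <- 4
k <- choose(n_vars, 2) - n_paths   # pairs of variables with no arrow

# Reported Fisher's C, tested on 2k degrees of freedom.
C <- 11.54
P <- pchisq(C, df = 2 * k, lower.tail = FALSE)
print(round(P, 2))

# Back out K from the reported AIC, since AIC = C + 2K.
K <- (49.54 - C) / 2
print(K)
```

Under these assumptions the chi-squared test has 12 degrees of freedom and the P-value agrees with the reported 0.48, while the AIC implies 19 estimated parameters across the four models.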
Shortcomings of Piecewise SEM
There are a few drawbacks to a piecewise approach. First, the entire issue of whether p-values can be meaningfully calculated for mixed models rears its ugly head once again (see famous post by Doug Bates here). Second, piecewise SEM cannot handle latent variables, which are variables that represent an unmeasured variable that is informed by measured variables (akin to using PCA to collapse environmental data, and using the first axis to represent the ‘environment’). Third, we cannot accurately test for d-separation in models with feedbacks (e.g., A -> B -> C -> A). Finally, I have run into issues where tests of d-separation cannot be run because the model is ‘fully identified,’ aka there are no missing paths. In such a model, all variables are causally dependent. One can, however, calculate an AIC value for these models.
Get to it!
Ultimately, I find SEM to be a uniquely powerful tool for thinking about, testing, and predicting complex natural systems. Piecewise SEM is a gentle introduction since most ecologists are familiar with constructing generalized linear models and basic mixed effects models. Please also let me know if anything I’ve written here is erroneous (including the code on GitHub!). These are complex topics, and I don’t claim to be any kind of expert! I’m afraid I’ve also only scratched the surface. For an excellent introductory resource to SEM, see Jarrett Byrnes’ SEM course website, and take the course if you ever get a chance: it’s fantastic! I hope I’ve done this topic sufficient justice to get you interested, and to try it out. Check out the references below and get started.
Shipley, Bill. “Confirmatory path analysis in a generalized multilevel context.” Ecology 90.2 (2009): 363-368.
Shipley, Bill. “The AIC model selection method applied to path analytic models compared using a d-separation test.” Ecology 94.3 (2013): 560-564.
Shipley, Bill. Cause and correlation in biology: a user’s guide to path analysis, structural equations and causal inference. Cambridge University Press, 2002.