Besides normality of residuals and homogeneity of variance, one of the biggest assumptions of linear modeling is independence of predictors. If two or more of the predictors in a model are correlated, then the model may produce unstable parameter estimates with highly inflated standard errors, resulting in an overall significant model with no significant predictors. In other words, bad news if your goal is to determine the contribution of each predictor in explaining the response. But there is hope!
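To make the inflation concrete, here is a minimal sketch in Python of the variance inflation factor (VIF) for the simplest case of two predictors. In that case VIF = 1 / (1 - r^2), where r is the correlation between the predictors, and the standard error of each coefficient grows by a factor of sqrt(VIF) relative to uncorrelated predictors. The data and function names are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictor values, chosen only to illustrate the two extremes.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_collinear = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]  # tracks x1 almost exactly
x2_unrelated = [2.0, 5.0, 3.0, 3.0, 5.0, 2.0]  # uncorrelated with x1

vif_collinear = vif_two_predictors(x1, x2_collinear)
vif_unrelated = vif_two_predictors(x1, x2_unrelated)

print("VIF, collinear pair:", vif_collinear)
print("VIF, unrelated pair:", vif_unrelated)
print("SE inflation, collinear pair:", math.sqrt(vif_collinear))
```

A VIF near 1 means the predictor carries its own information. With more than two predictors, r^2 is replaced by the R^2 from regressing each predictor on all of the others.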
For those (few?) of you who are marine benthic ecologists, you may be familiar with the size-fractionated abundance-to-biomass conversion equations proposed by Graham Edgar for benthic/epifaunal invertebrates. Basically, he derived a general equation relating epifaunal abundance to biomass (in mg AFDM), and biomass to rates of secondary production (in µg AFDM per day). For those of you working with small benthic invertebrates, these equations can cut out a lot of work in having to ash each species, with the added benefit of being able to preserve and retain all specimens. However, they can be a tad unwieldy to implement, unless, of course, you’re dealing with R!
Often in ecological research, we are interested not only in comparing univariate descriptors of communities, like diversity (such as in my previous post), but also in how the constituent species, or the composition, change from one community to the next.
One common tool to do this is non-metric multidimensional scaling, or NMDS. The goal of NMDS is to collapse information from multiple dimensions (e.g., from multiple communities, sites, etc.) into just a few, so that they can be visualized and interpreted. Unlike other ordination techniques that rely on (primarily Euclidean) distances, such as Principal Coordinates Analysis, NMDS uses rank orders, and thus is an extremely flexible technique that can accommodate a variety of different kinds of data.
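The "rank orders" idea can be sketched in a few lines of Python. This is not a full NMDS, only the input it works from: pairwise dissimilarities between sites, reduced to their ranks. The site-by-species counts are hypothetical, and Bray-Curtis is assumed here as the dissimilarity measure because it is a common choice for abundance data, not because the text above specifies it.

```python
from itertools import combinations

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den

def rank_pairs(dissimilarities):
    """Order site pairs from most similar to least similar."""
    return sorted(dissimilarities, key=dissimilarities.get)

# Hypothetical counts of three species at four sites.
sites = {
    "A": [10, 0, 3],
    "B": [8, 1, 4],
    "C": [0, 12, 1],
    "D": [1, 9, 0],
}

dissim = {
    (s, t): bray_curtis(sites[s], sites[t])
    for s, t in combinations(sites, 2)
}
ranked = rank_pairs(dissim)

# Any monotonic transformation of the dissimilarities leaves the ranks
# unchanged, so it would present the same ordering to a rank-based method.
squared = {pair: d ** 2 for pair, d in dissim.items()}
ranked_after_transform = rank_pairs(squared)

for position, pair in enumerate(ranked, start=1):
    print(position, pair, round(dissim[pair], 3))
```

Because only the ordering matters, the choice of dissimilarity measure can be matched to the data at hand, which is where the flexibility comes from.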
For decades, biologists and ecologists have largely characterized biological diversity using metrics based on entropy, a concept rooted in information theory that suggests one can quantify the degree of uncertainty associated with predicting bits and pieces of information. In ecology, this has boiled down to determining whether species drawn from a community are the same or different. The metrics will sound familiar to anyone who has taken an introductory ecology class (the Shannon index, Simpson diversity), but Lou Jost, Anne Chao, and others have highlighted the fact that the non-linearity of these indices may lead researchers to grossly misinterpret the underlying diversity of the community in question.
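A small worked example shows what the non-linearity looks like. The sketch below, in Python with made-up communities, compares the raw Shannon and Gini-Simpson indices against their "effective number of species" conversions, exp(H) and 1 / sum(p^2), for a community of 8 equally abundant species and one of 16. The function names are illustrative only.

```python
import math

def shannon_entropy(p):
    """Shannon index H = -sum(p * ln p) over relative abundances."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def gini_simpson(p):
    """Gini-Simpson index: 1 - sum(p^2)."""
    return 1.0 - sum(pi ** 2 for pi in p)

def effective_species_shannon(p):
    """Effective number of species implied by Shannon entropy: exp(H)."""
    return math.exp(shannon_entropy(p))

def effective_species_simpson(p):
    """Effective number of species implied by Simpson concentration."""
    return 1.0 / sum(pi ** 2 for pi in p)

# Hypothetical communities: perfectly even, with 8 and 16 species.
even_8 = [1 / 8] * 8
even_16 = [1 / 16] * 16

for name, community in (("8 species", even_8), ("16 species", even_16)):
    print(
        name,
        "| Shannon:", round(shannon_entropy(community), 3),
        "| Gini-Simpson:", round(gini_simpson(community), 4),
        "| effective (Shannon):", round(effective_species_shannon(community), 3),
        "| effective (Simpson):", round(effective_species_simpson(community), 3),
    )
```

The second community has twice as many species, yet the Shannon index rises by only about a third and Gini-Simpson barely moves, whereas both effective numbers double.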