Analysis of Bivariate Regressions

Dear Adam,

(1) In a way, yes: the slope is the global linear rate of change, so you could call the predictor with the steepest slope the “strongest predictor”. However, no single predictor gives you the whole story, and the predictors might be correlated with each other. That makes it hard to isolate one predictive effect, so I would tend to report all effects and avoid a judgement.
How much a variable predicts also depends on the distribution of its values and on its interplay with the other variables (correlation of the values and even of their slopes). This holds even for standardized values like yours. For illustration: let’s pretend the “Percent Black/Hispanic” plot had the steepest slope. Most of its values cluster at >0.5, so even a steep slope would not help you predict big differences between most of the schools in your data set.
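To make the clustering argument concrete, here is a minimal sketch with made-up numbers (neither the values nor the slopes come from your data). It standardizes two hypothetical predictors and compares the predicted difference in y between the 25th and 75th percentile school for each.

```python
import statistics

def standardize(values):
    """Rescale to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def typical_predicted_gap(slope, x_values):
    """Predicted difference in y between the 25th and 75th percentile of x."""
    q1, _, q3 = statistics.quantiles(x_values, n=4)
    return slope * (q3 - q1)

# Hypothetical predictor A: steep slope, but most schools cluster above 0.5.
x_clustered = [0.05, 0.55, 0.58, 0.60, 0.62, 0.63, 0.65, 0.66, 0.68, 0.70, 0.72]
# Hypothetical predictor B: shallower slope, values spread over the whole range.
x_spread = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

gap_a = typical_predicted_gap(2.0, standardize(x_clustered))
gap_b = typical_predicted_gap(1.0, standardize(x_spread))

print(f"steep slope, clustered values: {gap_a:.2f}")
print(f"shallow slope, spread values:  {gap_b:.2f}")
```

Standardizing fixes the standard deviation at 1 but not the shape of the distribution, so the clustered predictor still separates the middle half of the schools less than the spread-out one, despite having twice the slope.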

So it depends on which question is asked. For me, the question of “what is the strongest predictor” does not capture the essence of the problem (multiple predictors). But it can be done, see below.

(2) I would interpret the σ here not as the standard deviation of your data, but as the spread of the model residuals (the noise term usually denoted epsilon). It is also not a standard error: standard errors shrink with the number of observations, this one shouldn’t. Don’t mix it up with the “range of plausible true values” of your predictor slopes β (95% HPD interval, your green area), and remember that the intercept also has a 95% HPD interval, which adds on top of that green area.
A lower model residual means that the variable x explains more of the variation in y; yet someone else might have a clearer formulation.
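A rough sketch of the difference, assuming simulated data and plain least squares as a stand-in for your Bayesian model (all numbers and names here are invented for illustration): the residual σ stays near the true noise level as the sample grows, the uncertainty of the slope shrinks, and a lower σ goes along with a higher share of explained variation.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x with a few summary numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sigma = math.sqrt(rss / (n - 2))  # residual spread
    return {
        "sigma": sigma,
        "slope_se": sigma / math.sqrt(sxx),  # uncertainty of the slope
        "r2": 1.0 - rss / syy,  # share of variation in y explained by x
    }

def simulate(n, noise_sd, seed):
    """Simulated data: y = 1 + 0.8*x + Gaussian noise (assumed values)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1.0 + 0.8 * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

small = fit_line(*simulate(100, 0.5, seed=1))
large = fit_line(*simulate(10000, 0.5, seed=1))
quiet = fit_line(*simulate(100, 0.2, seed=1))

for name, res in [("n=100", small), ("n=10000", large), ("n=100, less noise", quiet)]:
    print(name, {k: round(v, 3) for k, v in res.items()})
```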

(3) I guess you got the orange intervals from posterior predictive (ppc) sampling? They indicate how well your model fits the data and where prediction is insufficient (outliers). Outliers are okay, since orange covers only 95% of the data, but they should be evenly distributed over the range of the predictor. Your plots show, for example, that for the “chronically absent” predictor a linear model is maybe a bit coarse, but still okay.
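One way to check this numerically rather than by eye is to count how many observations fall outside the interval in each slice of the predictor range. The sketch below is only an illustration: `outside_share_by_bin` is a hypothetical helper, and the curved data with a fixed-width band around a straight line are invented, not taken from your model.

```python
def outside_share_by_bin(x, y, lower, upper, n_bins=4):
    """Share of observations outside [lower, upper], overall and per x-bin."""
    x_min, x_max = min(x), max(x)
    width = (x_max - x_min) / n_bins
    counts = [0] * n_bins
    outside = [0] * n_bins
    for xi, yi, lo, hi in zip(x, y, lower, upper):
        # Clamp so that the largest x lands in the last bin.
        b = min(int((xi - x_min) / width), n_bins - 1)
        counts[b] += 1
        if yi < lo or yi > hi:
            outside[b] += 1
    overall = sum(outside) / len(x)
    per_bin = [o / c if c else 0.0 for o, c in zip(outside, counts)]
    return overall, per_bin

# Invented example: curved data, straight-line prediction band of fixed width.
x = [i / 19 for i in range(20)]
y = [2.0 * xi ** 2 for xi in x]
center = [2.0 * xi - 1.0 / 3.0 for xi in x]
lower = [c - 0.25 for c in center]
upper = [c + 0.25 for c in center]

overall, per_bin = outside_share_by_bin(x, y, lower, upper)
print("overall share outside:", overall)
print("share outside per bin:", per_bin)
```

If the misses pile up in the first and last bins, as in this invented example, that hints at curvature that a straight line does not capture.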

(4) You can learn that the models fit well, but that there were outliers. You also learn about the distributions of your data. So the plots are very informative and well polished.

You could try to combine the models and add betas for all your predictors in the same model, possibly even as a multivariate structure to account for correlations (see here). That makes it harder to interpret, but the model should be more accurate.
To estimate which is the strongest predictor, I would then fit four extra models, each leaving out one of the slopes in turn. Then you can do model comparison (LOO) to compare the predictive power of each reduced model with the full one. The predictor whose removal costs the most predictive power is the strongest one.
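The procedure can be sketched as follows. Note the substitutions: this uses plain least squares and exact leave-one-out squared error instead of a Bayesian model with a LOO estimate, and the data are simulated with four independent predictors and assumed slopes. It only illustrates the drop-one-slope logic, not your actual model.

```python
import random

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(rows, ys):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

def loo_mse(rows, ys):
    """Exact leave-one-out mean squared prediction error."""
    total = 0.0
    for i in range(len(ys)):
        coef = fit(rows[:i] + rows[i + 1:], ys[:i] + ys[i + 1:])
        pred = coef[0] + sum(c * v for c, v in zip(coef[1:], rows[i]))
        total += (ys[i] - pred) ** 2
    return total / len(ys)

# Simulated data with assumed slopes for four predictors.
rng = random.Random(0)
true_slopes = [1.0, 0.5, 0.1, 0.0]
rows = [[rng.gauss(0, 1) for _ in true_slopes] for _ in range(80)]
ys = [sum(b * v for b, v in zip(true_slopes, r)) + rng.gauss(0, 0.3) for r in rows]

full = loo_mse(rows, ys)
loss_without = {}
for j in range(len(true_slopes)):
    reduced = [r[:j] + r[j + 1:] for r in rows]  # drop predictor j
    loss_without[j] = loo_mse(reduced, ys) - full

strongest = max(loss_without, key=loss_without.get)
print("extra LOO error when dropped:", {j: round(v, 3) for j, v in loss_without.items()})
print("strongest predictor index:", strongest)
```

With correlated predictors the picture gets murkier, because a dropped predictor can be partly replaced by its correlated neighbours. That is one more reason to read such a ranking with care.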

Best,

Falk
