Thanks for the critical eye here.
I’m not sure where your R2 is coming from, but the eye-ball test would seem to prefer the skew normal model.
I used az.r2_score for each model fit:
>>> az.r2_score(bcDist, ppc['y_pred'])
r2 0.516865
r2_std 0.057976
When I run the same on the skew normal model, it reports a score of R2 ~ 0.13.
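For concreteness, the comparison I ran looks roughly like this (a sketch; ppc_normal and ppc_skew are placeholder names for the posterior predictive dicts returned by pm.sample_posterior_predictive() for each model):

import arviz as az

# bcDist: the observed Bray-Curtis dissimilarities
# each ppc['y_pred'] has shape (n_samples, n_observations)
print(az.r2_score(bcDist, ppc_normal['y_pred']))  # r2 ~0.52 for me
print(az.r2_score(bcDist, ppc_skew['y_pred']))    # r2 ~0.13 for me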
In general, my own preference is to model the data according to (what I believe are) the characteristics of actual generative processes while also considering what your goals are.
I agree with this, of course. So, some info to explain my doubts: this data comes from an ecological distance-decay study of tree communities in a mountainous tropical region. Each data point is a comparison between the plants at one site and those at another: x is the distance between the sites in meters, and y is the dissimilarity coefficient, where 1 means totally different plant communities and 0 means identical ones.
While building this model, I’m trying to account for two major influences on the pattern of decay of similarity. First is the expected behavior of dissimilarity from a geographical perspective: comparisons that are close (= low x values) are expected to be very similar, and comparisons approach dissimilarity y = 1 with distance. But a second process “noises up” this simple behavior: within each of the steep micro-basins of the study region there is extreme variation in habitat, producing the full range of differences between site comparisons over short distances.
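For concreteness, the geographic part of the story is a standard negative-exponential distance decay. A minimal sketch of one common parameterization (my actual mean function may differ in detail), with kappa as the distance at which similarity halves:

import numpy as np

def expected_dissimilarity(x, kappa):
    # Similarity halves every kappa meters, so dissimilarity
    # climbs from 0 at x = 0 toward 1 with distance.
    return 1.0 - np.exp(-np.log(2) * x / kappa)

# e.g. with kappa = 1000 m, sites 1 km apart are ~0.5 dissimilar
print(expected_dissimilarity(1000.0, kappa=1000.0))  # 0.5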
Data should be allowed to shift your posterior from your priors. If this isn’t possible, you can do without data.
I understand this; modifying beliefs with data is fundamental to the Bayesian approach and is why I’m drawn to Bayesian methods, so I’m not arguing the point. My worry is that I am forcing an unrealistic model for the variance onto the data, and that this is weakening the utility of the model in other respects. Part of the reason I am suspicious of the large shifts from the priors I set is that the resulting posterior for κ doesn’t fit the existing literature, or common sense, at all: no ecosystem has yet been observed with a halving of similarity (κ in my model) of ~12 m, which is the mean of the posterior distribution for κ in the skew normal model above (except maybe a museum greenhouse!?).
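(For reference, I’m reading off the κ posterior with the standard ArviZ calls; trace here is the trace from the skew normal model:)

import arviz as az

# Posterior summary for kappa: mean, sd, and HPD interval
print(az.summary(trace, var_names=['kappa']))

# Visual check of the kappa posterior against ecological expectations
az.plot_posterior(trace, var_names=['kappa'])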
I guess, in my largely-Bayesian-ignorant way, I am asking: isn’t it possible that by enforcing a skewed model on the early portion of the data, where the curve of the mean is extremely steep (say, x < 500 in this case), I am driving the curve of the mean even higher to accommodate a long tail of variance beneath it? Hence the “right angle” shape?
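To make the worry concrete: with a strongly negative shape parameter, the skew normal’s location sits well above its mean, so fitting the bulk of the data can drag the location (and hence my plotted curve) upward. A quick check with scipy, using made-up parameter values:

from scipy import stats

# Skew normal with a long lower tail (negative shape parameter a)
dist = stats.skewnorm(a=-10, loc=0.9, scale=0.3)
print(dist.mean())  # ~0.66: the mean sits well below the location of 0.9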
The toy data you were using previously had no upper bound at y=1 , so the symmetric noise seemed fine.
Yes, I broke the challenges I was having into a couple of separate example problems, to be handled one by one, and I saved this y < 1 problem for this stage. I wasn’t overly concerned by this theoretical boundary, because I didn’t want to bow too much to the constraints of the metric (Bray-Curtis dissimilarity) at the risk of ignoring the ecological processes we are trying to model.
The mean of the skew normal is not μ…
Okay, understood. The variance bands plotted above are probably still correct, though: they are HPDs generated from the posterior predictive distributions with az.plot_hpd().
To access the true mean of the skew normal for plotting, etc., would I take the mean of the y_pred values in the posterior predictive trace of the above model? It sounds like this might give a more sensible (human-readable) way of looking at the results of the skew normal model.
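Something like either of these is what I have in mind (a sketch; the analytic route assumes mu, sigma, and alpha are all recorded in the trace, with mu stored as a pm.Deterministic since it varies with x):

import numpy as np

# Option 1: empirical mean curve from the posterior predictive samples;
# ppc['y_pred'] has shape (n_samples, n_obs)
mean_curve = ppc['y_pred'].mean(axis=0)

# Option 2: analytic skew-normal mean from the posterior parameter draws:
# E[y] = mu + sigma * delta * sqrt(2/pi), with delta = alpha / sqrt(1 + alpha**2)
mu, sigma, alpha = trace['mu'], trace['sigma'], trace['alpha']
delta = alpha / np.sqrt(1.0 + alpha**2)
# broadcast the per-draw sigma and alpha against the per-observation mu curve
analytic_mean = (mu + (sigma * delta)[:, None] * np.sqrt(2.0 / np.pi)).mean(axis=0)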