Unexpected prior predictive behaviour

It looks like you are generating a batch of samples, plugging them back into your model to get a batch of predictions (for each category), taking the mean of these predictions, and then plotting those means. The means themselves might be of interest (e.g., as a sanity check that nothing is totally wacky), but this approach essentially discards how the uncertainty in your prior/posterior propagates through your model and induces uncertainty in the predictions themselves, which is often the whole point of predictive sampling.

To visualize that uncertainty, I would plot a small number (e.g., 100) of individual predictions, separately for each category. After that, I would calculate the mean and SD of all the predictions for each category. Or plot a histogram of all the predictions, separately for each category. Or all of the above!
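Something along these lines shows all three views. This is a toy sketch, not your actual model: the priors over alpha/beta, the categories, and the predictor values are made up stand-ins, so substitute your own.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

n_draws = 1000
categories = ["A", "B", "C"]            # stand-ins for your categories
x = {"A": 0.0, "B": 1.0, "C": 2.0}      # stand-in predictor values per category

# Hypothetical priors -- replace with the ones from your model
alpha = rng.normal(0.0, 1.0, size=n_draws)
beta = rng.normal(0.0, 0.5, size=n_draws)

# One prediction per draw and category: shape (n_draws, n_categories)
preds = np.column_stack([alpha + beta * x[c] for c in categories])

# 1) A subset of individual draws, so the spread is visible
plt.figure()
for draw in preds[:100]:
    plt.plot(categories, draw, color="C0", alpha=0.1)
plt.title("100 individual prior predictive draws")

# 2) Mean and SD of all draws, per category
for i, c in enumerate(categories):
    print(f"{c}: mean = {preds[:, i].mean():.3f}, sd = {preds[:, i].std():.3f}")

# 3) Histogram of all draws, per category
fig, axes = plt.subplots(1, len(categories), sharex=True)
for ax, c, col in zip(axes, categories, preds.T):
    ax.hist(col, bins=30)
    ax.set_title(c)
plt.show()
```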

Right now, seaborn is plotting the mean of the predictions (separately for each category). The fact that these means are close to 0.0 but not exactly zero is probably not a big deal (the discrepancy is much smaller than the SDs in your priors over alpha and beta), but you won't know for sure until you figure out how much uncertainty (variability) there is in each category's predictions.

Also note that seaborn is constructing error bars/bands that reflect confidence intervals of the mean. CIs are expected to shrink as you take more draws and do not reflect the variability you are (probably) most interested in.
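If you do want seaborn to summarize the draws for you, and you're on seaborn >= 0.12, you can ask for a spread-based interval instead of the default CI of the mean via the `errorbar` option. Again a toy sketch with made-up long-format data, since I don't know which seaborn function you're calling:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

# Toy long-format data standing in for your predictions:
# one row per draw, labelled with its category.
df = pd.DataFrame({
    "category": np.repeat(["A", "B", "C"], 1000),
    "prediction": rng.normal(0.0, np.repeat([1.0, 1.5, 2.0], 1000)),
})

# Error bars showing the SD of the draws (spread), not a CI of the mean
sns.pointplot(data=df, x="category", y="prediction", errorbar="sd")
plt.show()

# Or a 95% percentile interval of the draws
sns.pointplot(data=df, x="category", y="prediction", errorbar=("pi", 95))
plt.show()
```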
