Recommended approach for OOS prediction

I’m constructing a hierarchical GLM (thanks to the great examples in the notebook folder) but I’m not sure if I’m predicting the out-of-sample data correctly.

The radon notebook takes the trace of the linear regression parameters and uses their posterior means to build a point-estimate predictive model (see the plots comparing different districts).

On the other hand, the Bayesian NN notebook uses the sample_ppc function to generate predictions and then averages over those predictions (see the first contourf plot).

As I understand it, the second approach is more rigorous because it accounts for the uncertainty in the model parameters. However, I'm not completely sure and am happy to be corrected. Perhaps these actually achieve slightly different objectives?
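If it helps, here is roughly what I mean by the two approaches, sketched in plain NumPy with made-up parameter names (`a` for the intercept, `b` for the slope) and fake posterior draws standing in for a real trace:

```python
import numpy as np

# Fake posterior draws standing in for trace['a'] and trace['b'].
rng = np.random.default_rng(0)
a_draws = rng.normal(1.0, 0.1, size=2000)   # intercept draws
b_draws = rng.normal(0.5, 0.05, size=2000)  # slope draws

x_new = np.linspace(-1.0, 2.0, 100)         # out-of-sample inputs

# Radon-notebook style: collapse the posterior to its mean first,
# then predict once with the "plug-in" point estimates.
y_plugin = a_draws.mean() + b_draws.mean() * x_new

# Bayesian-NN style: predict with every posterior draw, then average.
y_draws = a_draws[:, None] + b_draws[:, None] * x_new[None, :]
y_mean = y_draws.mean(axis=0)
y_sd = y_draws.std(axis=0)  # predictive spread the plug-in version discards
```

For a model that is linear in its parameters the two point predictions coincide; the difference is that the per-draw version also gives me the uncertainty band.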

The two do roughly the same thing. In the radon notebook we recreate the model outside of PyMC3 (simple, since it's just linear) and draw regression lines from the trace. Note that we could instead reuse the existing model specification by wrapping the data in theano.shared variables and swapping in new values before calling sample_ppc to evaluate over a grid, as the Bayesian NN example does. I think the Bayesian NN approach should be preferred, since it's harder to shoot yourself in the foot.