Recommended approach for OOS prediction

I’m constructing a hierarchical GLM (thanks to the great examples in the notebook folder) but I’m not sure if I’m predicting the out-of-sample data correctly.

The radon notebook takes the trace of the linear regression parameters and uses their posterior means to build a point-estimate predictive model (see the plots comparing different districts).

On the other hand, the Bayesian NN notebook uses the sample_ppc function to generate predictions and then averages over those predictions (see the first contourf plot).

As I understand it, the second approach is more rigorous because it accounts for the uncertainty in the model parameters. However, I'm not completely sure and am happy to be corrected. Perhaps these actually achieve slightly different objectives?
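If it helps, here is roughly what I mean by the two approaches, sketched in plain NumPy with made-up parameter names (`a` for the intercept, `b` for the slope) and fake posterior draws standing in for a real trace:

```python
import numpy as np

# Fake posterior draws standing in for trace['a'] and trace['b'].
rng = np.random.default_rng(0)
a_draws = rng.normal(1.0, 0.1, size=2000)   # intercept draws
b_draws = rng.normal(0.5, 0.05, size=2000)  # slope draws

x_new = np.linspace(-1.0, 2.0, 100)         # out-of-sample inputs

# Radon-notebook style: collapse the posterior to its mean first,
# then predict once with the "plug-in" point estimates.
y_plugin = a_draws.mean() + b_draws.mean() * x_new

# Bayesian-NN style: predict with every posterior draw, then average.
y_draws = a_draws[:, None] + b_draws[:, None] * x_new[None, :]
y_mean = y_draws.mean(axis=0)
y_sd = y_draws.std(axis=0)  # predictive spread the plug-in version discards
```

For a model that is linear in its parameters the two point predictions coincide; the difference is that the per-draw version also gives me the uncertainty band.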

The two do roughly the same thing. In the radon notebook we recreate the model outside of PyMC3 (simple, since it's just linear) and draw regression lines from the trace. Note that we could instead reuse the existing model specification by wrapping the data in theano.shared variables and swapping in new values before calling sample_ppc to evaluate over a grid, as the Bayesian NN example does. I think the Bayesian NN approach should be preferred, since it's harder to shoot yourself in the foot.