Bayesian Neural Network

Input [11] shows two ways to generate posterior predictions: using sample_posterior_predictive (the universal way), or building the posterior sample by replacing the input matrix in the computational graph (an alternative available only with VI). The latter generates the samples inside the Theano graph, so it is faster. The beginning of Input [11] shows how to replace part of the graph to achieve this.

There is no particular reason to use 500 samples; it is not a hyperparameter. You can improve the estimate simply by drawing more samples.
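A quick NumPy illustration of why more samples help (with made-up numbers, not the notebook's model): the Monte Carlo error of an averaged posterior predictive estimate shrinks like 1/sqrt(n), so 5000 draws give a noticeably tighter estimate than 500:

```python
import numpy as np

rng = np.random.default_rng(42)
true_mean = 0.3  # pretend this is the true predictive probability

def mc_estimate(n_samples):
    # Stand-in for averaging n_samples posterior predictive draws.
    draws = rng.normal(true_mean, 0.5, size=n_samples)
    return draws.mean()

# Spread of the estimator across repeated runs, for two sample sizes.
spread_500 = np.std([mc_estimate(500) for _ in range(300)])
spread_5000 = np.std([mc_estimate(5000) for _ in range(300)])
```

The spread with 5000 draws is roughly sqrt(10) times smaller than with 500, which is the usual diminishing-returns trade-off: more samples always help, but each factor-of-ten costs ten times the compute.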