I’m wondering how to use the prior predictive distribution (PrPD) and how useful it really is.
The literature on the subject is vast, so I will quote only Gelman (The Prior Can Often Only Be Understood in the Context of the Likelihood), who says: “A fundamental tool for understanding the effect of the prior on inference before data has been collected is the prior predictive distribution […] The careful application […] leads us to some concrete recommendations of how to choose a prior that ensures robust Bayesian analyses in practice.”
Ok, but let’s take the concrete example of a very simple beta-binomial model. How can we use the prior predictive distribution in that case?
As an example, suppose I plan an experiment consisting of n=73 Bernoulli draws with parameter \theta and decide to use a \beta(0.5, 0.5) distribution as the prior. The likelihood is then binomial in the number x of successes.
I can then sample from my prior and, for each randomly drawn \theta_i, sample from the corresponding binomial likelihood to get a random x_i. As far as I know, the histogram of the x_i thus obtained (ranging from 0 to 73) approximates the prior predictive distribution, which in my case will look like the \beta(0.5, 0.5) distribution. Is that right?
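To make the procedure concrete, here is a minimal sketch of the sampling I have in mind, using only the Python standard library. The function name `sample_prior_predictive`, the number of draws, the seed, and the bin width of the text histogram are my own arbitrary choices for illustration, not something taken from the paper.

```python
import random
from collections import Counter

N_TRIALS = 73          # number of Bernoulli draws in the planned experiment
A, B = 0.5, 0.5        # parameters of the Beta prior on theta


def sample_prior_predictive(n_trials, a, b, n_draws=10_000, seed=0):
    """Draw theta_i from Beta(a, b), then x_i from Binomial(n_trials, theta_i)."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n_draws):
        theta = rng.betavariate(a, b)
        # A binomial draw is the number of successes in n_trials Bernoulli draws.
        x = sum(rng.random() < theta for _ in range(n_trials))
        xs.append(x)
    return xs


xs = sample_prior_predictive(N_TRIALS, A, B)
counts = Counter(xs)

# Crude text histogram: share of simulated x_i falling in each bin.
BIN_WIDTH = 6
for lo in range(0, N_TRIALS + 1, BIN_WIDTH):
    hi = min(lo + BIN_WIDTH - 1, N_TRIALS)
    share = sum(counts[x] for x in range(lo, hi + 1)) / len(xs)
    print(f"{lo:2d}-{hi:2d}  {share:6.3f}  {'#' * round(200 * share)}")
```

Changing `A` and `B` to 1.0 gives the same simulation under the uniform prior, which is the alternative I consider further down.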
If so, what “concrete recommendations” could I draw from it? If I then perform my intended experiment and get, say, x=46, how could I use the PrPD I just built to “choose a prior that ensures robust Bayesian analyses in practice”?
Should I conclude that my PrPD leads me to predict values of x close to 0 or 73 and that, since my observed value of 46 does not fall in these regions, my prior is unsuitable? And that I should use a uniform \beta(1, 1) prior instead? In that case, I would be adjusting my prior according to my experimental result… is that legitimate?
Or is it simply that the PrPD is not a useful tool for a Bayesian model as simple as the one I just described?
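For reference, the comparison between the two priors mentioned above does not need simulation: for a Beta(a, b) prior and n Bernoulli draws, the prior predictive distribution is the beta-binomial, with probability C(n, x) B(x + a, n - x + b) / B(a, b) for x successes. Below is a small sketch that evaluates it for both priors; the helper names `log_beta` and `beta_binomial_pmf` are hypothetical names I made up, and I work on the log scale via `math.lgamma` only to avoid overflow.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """Prior predictive probability of x successes in n draws under a Beta(a, b) prior."""
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose + log_beta(x + a, n - x + b) - log_beta(a, b))


n = 73
x_obs = 46  # the hypothetical experimental result from the question

pmfs = {}
for a, b in [(0.5, 0.5), (1.0, 1.0)]:
    pmf = [beta_binomial_pmf(x, n, a, b) for x in range(n + 1)]
    pmfs[(a, b)] = pmf
    print(f"Beta({a}, {b}): P(x=0) = {pmf[0]:.4f}, P(x={x_obs}) = {pmf[x_obs]:.4f}")
```

Under the \beta(1, 1) prior every value of x from 0 to 73 has the same prior predictive probability, 1/74, so this gives a flat baseline against which the U-shaped \beta(0.5, 0.5) case can be read.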