Conceptually, if your data is split into inputs x and outputs y, and you set up a Bayesian model, for example:

```
b ~ N(0, 10)
a ~ N(0, 10)
sigma ~ HN(0, 10)
y ~ N(a*x + b, sigma)
```

I don’t understand conceptually how you do a prior predictive check … because you can sample a, b, and sigma … but then y lies in a distribution whose mean depends on x … so we would need some x values as well to get a distribution of y values …

My guess is … you have to provide the inputs x, then you can sample a, b, and sigma … and then you can sample y to get, *for any fixed input x*, a distribution of possible y values from N(a*x + b, sigma) …

(But then this gives a distribution for every input x in your data set … which seems too complicated …)
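The guessed procedure above can be sketched in plain NumPy (the particular x values, seed, and number of draws are made up for illustration; the half-normal prior is drawn as the absolute value of a normal):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed inputs: the observed x values from the data set.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

n_draws = 1000

# Step 1: sample the parameters from their priors.
b = rng.normal(0, 10, size=n_draws)
a = rng.normal(0, 10, size=n_draws)
sigma = np.abs(rng.normal(0, 10, size=n_draws))  # HN(0, 10) via |N(0, 10)|

# Step 2: for each prior draw, sample y at every x.
# Broadcasting gives shape (n_draws, len(x)): one simulated
# data set per prior draw, i.e. a distribution of y at each x.
y = rng.normal(a[:, None] * x[None, :] + b[:, None], sigma[:, None])
```

So yes, this produces a distribution of y values for every x in the data set; in practice one summarizes it, e.g. by plotting a few simulated data sets or quantile bands of y against x.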

- Is this guess right? (If not, can you please correct me?)
- Is it better (more robust) to model the inputs x as well (i.e. add parameters that model the x inputs), so that your prior predictive check would produce a distribution over pairs (x, y)? What is the standard practice for this …