I was referring to my hierarchical data; sorry if that was not clear. I want to rule out any explanation other than shrinkage.
For example, if I pass a point estimate (MAP, mean, median, a quantile, or any other single point) to sample_posterior_predictive instead of the trace, I would expect the sampling to be affected. If it is not, why not?
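To make concrete what I mean, here is a minimal sketch in plain NumPy (not PyMC). It assumes a normal likelihood and that the single point would be used as the only parameter value when simulating. The names `mu_draws` and `sigma_draws` and all numbers are made up for illustration; they stand in for a trace.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trace: posterior draws of a group mean `mu`
# and a residual scale `sigma` (illustrative values, not from my model).
n_draws = 20_000
mu_draws = rng.normal(loc=2.0, scale=0.5, size=n_draws)
sigma_draws = np.abs(rng.normal(loc=1.0, scale=0.1, size=n_draws))

# (a) Predict from all draws: one simulated observation per posterior draw,
#     so parameter uncertainty and residual noise both enter.
y_trace = rng.normal(loc=mu_draws, scale=sigma_draws)

# (b) Predict from a single point (here the posterior mean):
#     only the residual noise enters.
y_point = rng.normal(loc=mu_draws.mean(), scale=sigma_draws.mean(), size=n_draws)

var_trace = y_trace.var()
var_point = y_point.var()
print(f"variance using all draws:      {var_trace:.3f}")
print(f"variance using point estimate: {var_point:.3f}")
```

In this toy setup the predictions from the single point are narrower, because the spread of `mu_draws` no longer contributes. Whether the same happens inside sample_posterior_predictive is exactly what I am asking.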
find_MAP() is still used in the documentation (here and here, as linked initially). So when you write “of little use in most scenarios”, what counts as “little” and what as “most”? I would like to understand, not just follow blindly.
As written in the original post, I need an array in which the extra dimension is collapsed somehow, but without reducing the variability. Randomly choosing one element along that axis, as I suggested, seems to work, but I wanted to make sure I understood everything correctly.
To understand that, I thought it would help to know how the model residual affects the prediction.
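For reference, this is roughly what I mean by randomly choosing one element along the extra axis. It is a sketch under assumptions: the array `ppc` is a hypothetical posterior predictive array with draws on axis 0 and observations on axis 1, filled here with random numbers rather than output from my model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior predictive array: axis 0 = draws, axis 1 = observations.
n_draws, n_obs = 2000, 500
ppc = rng.normal(loc=0.0, scale=1.0, size=(n_draws, n_obs))

# Pick one draw index per observation and take that element from each column.
idx = rng.integers(0, n_draws, size=n_obs)
collapsed_random = ppc[idx, np.arange(n_obs)]  # shape (n_obs,)

# For comparison: averaging over draws also removes the axis,
# but it shrinks the spread across observations.
collapsed_mean = ppc.mean(axis=0)

print(collapsed_random.shape)
print(f"std after random pick: {collapsed_random.std():.3f}")
print(f"std after averaging:   {collapsed_mean.std():.3f}")
```

The random pick keeps roughly the spread of the individual draws, while the mean over draws is much narrower, which is the loss of variability I want to avoid.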