Hi all,
I am updating some code which I initially wrote during the pymc3/pymc4 transition. It seemed back then that things would get simpler based on a preliminary version of v4.0.0, but apparently plans changed.
In particular, I am confused by the apparent decision to no longer allow a different number of “samples” in a future release (see here and here, though I admit I could not follow the latter discussion).
Here is my use case:
- my model has multiple thetas, so it would be great to have multivariate posteriors
- it is a linear model: the intercept and all the slopes are multiplied with `pm.Data` objects (for the former, this is just `ones`)
- sampling works fine; I have a model and a trace
- I then want to perform out-of-sample prediction: I have a subset of my observations which was excluded from model fitting; hence I adjust the data with `pm.set_data()`
- I would then like to have `n` predictions for each of those novel observations; note that `n` can be an arbitrary (high) number: it is independent of the number of original observations, the number of posterior predictions, or the number of chains.
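
To make the last point concrete, here is a minimal NumPy-only sketch of the result I am after. It is not PyMC code: the function `predict_n` and all variable names are made up for illustration, the "posterior" draws are faked with random numbers, and a Gaussian likelihood is assumed. The idea is simply to resample posterior draws with replacement, so that `n` is decoupled from the number of draws, chains, and original observations.

```python
import numpy as np


def predict_n(intercept, slopes, sigma, x_new, n, seed=None):
    """Draw n predictions for every row of x_new (hypothetical helper).

    intercept : (S,)   posterior draws, chains already flattened
    slopes    : (S, P) posterior draws of the P slopes
    sigma     : (S,)   posterior draws of the residual scale
    x_new     : (M, P) predictors of the M novel observations
    returns   : (n, M) array of predictions
    """
    rng = np.random.default_rng(seed)
    # Pick n posterior draws with replacement: n may exceed S.
    idx = rng.integers(0, intercept.shape[0], size=n)
    # Linear predictor for each selected draw and each novel observation.
    mu = intercept[idx, None] + slopes[idx] @ x_new.T
    # Add observation noise (assumed Gaussian here).
    return rng.normal(mu, sigma[idx, None])


# Fake posterior draws standing in for a real trace (assumption).
rng = np.random.default_rng(0)
S, P, M = 400, 3, 5
intercept = rng.normal(1.0, 0.1, size=S)
slopes = rng.normal(0.5, 0.1, size=(S, P))
sigma = np.abs(rng.normal(0.3, 0.05, size=S))
x_new = rng.normal(size=(M, P))

preds = predict_n(intercept, slopes, sigma, x_new, n=10_000, seed=1)
print(preds.shape)
```

Here `n=10_000` is far larger than both the 400 fake draws and the 5 novel observations, which is exactly the independence I would like to keep.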
Is there any recent example of how to do this? Or could someone provide one?
Before v4, my workaround was to iterate over my novel observations, repeat them so that the data shape matches the original data, and then sample repeatedly to get a sufficiently high number of predictions, always a multiple of the number of original observations. But that is incredibly cumbersome.
I am posting this here because I fear that, when discussing the changes to `sample_posterior_predictive`, you might not have had the use case of `set_data` out-of-sample prediction on your radar.
Thanks a lot!
Falk
(PS: my actual data is more involved; you can find all details here and here)