I have a model in bambi, and I can get the predictive distribution for new data in this model into a dataframe by running model.predict(…).to_dataframe(). But when I do, I get a long list of responses for each draw and each observation. Is there a way to list the observation, draw, and chain that generated each prediction?
I am not very familiar with bambi, but I think the call to predict() returns a standard arviz.InferenceData object, and you are probably only interested in the posterior_predictive group of that object. So instead of converting the entire object to a dataframe, you probably just want that group. Since you are converting the return value of predict(), I assume you have set the inplace argument to False:
ppc = model.predict(idata, kind='pps', inplace=False)['posterior_predictive']
# convert to pandas dataframe if you like
print(ppc.to_dataframe())
Given the defaults, I think the idea is to use the InferenceData object as cumulative storage, with predict() adding its results to the object you pass in:
idata = model.fit()
model.predict(idata, kind='pps')
print(idata['posterior_predictive'].to_dataframe())
@cluhmann points in the right direction.
Model.predict() modifies or creates an arviz.InferenceData object. When you use kind="mean", it adds a new variable to the .posterior group (the name of the new variable is the name of the response with _mean appended). If you use kind="pps", it draws posterior predictive samples and adds them to the .posterior_predictive group.
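To get the observation, draw, and chain as ordinary columns, calling reset_index() on the dataframe should be enough, since the groups are xarray datasets and to_dataframe() puts the dimensions into a MultiIndex. Here is a minimal sketch on a synthetic stand-in for the posterior_predictive group, so it runs without fitting a model. The variable name y and the observation dimension name y_obs are assumptions for illustration; check the dims of your own posterior_predictive group for the actual names.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for idata.posterior_predictive:
# 2 chains, 3 draws per chain, 4 observations.
# "y" and "y_obs" are assumed names, not necessarily what bambi produces.
rng = np.random.default_rng(0)
ppc = xr.Dataset(
    {"y": (("chain", "draw", "y_obs"), rng.normal(size=(2, 3, 4)))},
    coords={"chain": [0, 1], "draw": [0, 1, 2], "y_obs": [0, 1, 2, 3]},
)

# to_dataframe() stores chain, draw and the observation index as MultiIndex
# levels; reset_index() turns those levels into regular columns.
long_df = ppc.to_dataframe().reset_index()
print(long_df.head())
```

Each row of long_df is then one prediction, labelled with the chain, the draw, and the observation it belongs to. With a real model the same two calls would be applied to idata.posterior_predictive instead of the synthetic ppc.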