How to make InferenceData returned by sample() aware of the prior and posterior_predictive

DanWeitzenfeld · September 6, 2021, 3:02am

In the ArviZ docs, there is this example using from_pymc3:

trace = pm.sample(draws, chains=chains)
prior = pm.sample_prior_predictive()
posterior_predictive = pm.sample_posterior_predictive(trace)

pm_data = az.from_pymc3(
    trace=trace,
    prior=prior,
    posterior_predictive=posterior_predictive,
    coords={"school": np.arange(eight_school_data["J"])},
    dims={"theta": ["school"], "theta_tilde": ["school"]},
)

So pm_data is an InferenceData that is aware of the trace, prior, and posterior_predictive.

But soon pm.sample will return an InferenceData by default. How do I make that InferenceData aware of the prior and posterior_predictive?

jhrcook · September 8, 2021, 12:39pm

There is this example in the “Example of InferenceData schema in PyMC3” guide from ArviZ:

dims_pred = {
    "slack_comments": ["candidate developer"],
    "github_commits": ["candidate developer"],
    "time_since_joined": ["candidate developer"],
}
with model:
    pm.set_data({"time_since_joined": candidate_devs_time})
    predictions = pm.sample_posterior_predictive(trace)
    az.from_pymc3_predictions(
        predictions,
        idata_orig=idata_pymc3,
        inplace=True,
        coords={"candidate developer": candidate_devs},
        dims=dims_pred,
    )

However, this function adds new groups to the InferenceData object (predictions and, when the model has data containers, predictions_constant_data) and doesn’t change the posterior_predictive group.
Therefore, it doesn’t solve this problem.

Another option is to use the InferenceData.add_groups() method, but this feels hacky. My main concern is that I would not be following standard operating procedures for ArviZ and that this would produce annoying or misleading results later. Hopefully an ArviZ dev will chime in.

nb_trace.add_groups({"posterior_predictive": daysabs_post_pred})
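To make the one-liner above concrete, here is a minimal sketch with synthetic arrays standing in for the real model output. It assumes the InferenceData API of ArviZ releases before 1.0; the variable names "beta" and "daysabs" and all shapes are made up for illustration.

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(1)

# Stand-in for the InferenceData returned by pm.sample(); "beta" is a made-up name.
nb_trace = az.from_dict(posterior={"beta": rng.normal(size=(2, 100))})

# Stand-in for the dict returned by pm.sample_posterior_predictive().
# When given a dict, add_groups() treats the first two axes of each array as
# (chain, draw), so a flat (n_samples, n_obs) array would need reshaping first.
daysabs_post_pred = {"daysabs": rng.poisson(5, size=(2, 100, 10))}

# Attach the samples as a new "posterior_predictive" group, in place.
nb_trace.add_groups({"posterior_predictive": daysabs_post_pred})

pp_dims = dict(nb_trace.posterior_predictive.sizes)
print(nb_trace.groups())
print(pp_dims)
```

As far as I can tell, add_groups() refuses to overwrite a group that already exists, so it only works when the InferenceData has no posterior_predictive group yet.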

Edits

Edited to say that the function does not solve this specific problem.
Added the .add_groups() method.

OriolAbril · September 8, 2021, 5:02pm

I’d recommend taking a look at the “A Hierarchical model for Rugby prediction” and “A Primer on Bayesian Methods for Multilevel Modeling” notebooks in the PyMC3 documentation.

DanWeitzenfeld · September 8, 2021, 5:11pm

Thanks! It looks like extend and the idata_orig argument to arviz.from_pymc3_predictions will do what I’m looking for.
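For anyone landing here later, a minimal sketch of the extend approach as I understand it. It uses synthetic arrays instead of a real model and assumes the InferenceData API of ArviZ releases before 1.0. The names "mu" and "y" are hypothetical; in a real PyMC3 workflow the two extra objects would come from az.from_pymc3(prior=..., posterior_predictive=...) called inside the model context.

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(0)

# Stand-in for the InferenceData returned by pm.sample(return_inferencedata=True):
# 2 chains, 50 draws of a scalar parameter "mu" (hypothetical name).
idata = az.from_dict(posterior={"mu": rng.normal(size=(2, 50))})

# Stand-ins for the prior and posterior predictive samples, already wrapped
# in their own InferenceData objects.
prior_idata = az.from_dict(prior={"mu": rng.normal(size=(1, 50))})
ppc_idata = az.from_dict(posterior_predictive={"y": rng.normal(size=(2, 50, 8))})

# extend() copies groups from the other object into idata in place; with the
# default join="left", groups already present in idata are left untouched.
idata.extend(prior_idata)
idata.extend(ppc_idata)

groups = idata.groups()
print(groups)
```

After this, idata carries the posterior, prior, and posterior_predictive groups in a single object, which is what plotting functions such as az.plot_ppc look for.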
