Model scoring in pymc

Ansul · September 11, 2023, 1:08pm

Hello - I have a question about how to score a PyMC model on a different dataset. Here is how I have structured my model:

import pymc as pm

def model_factory(data1):
    with pm.Model() as example_model:
        ....
        ....
    return example_model

Then, to generate the trace, I use the following code -

with model_factory(data1):
    trace = pm.sampling_jax.sample_numpyro_nuts(
        draws=1000,
        tune=1000,
        chains=4,
        random_seed=1111,
        target_accept=0.95,
    )

To generate posterior predictive samples, I use the following code -

with model_factory(data1):
    predictive_preds = pm.sample_posterior_predictive(trace=trace, random_seed=1234).to_dataframe(include_coords=False)

My question is - how do I generate samples for a different dataset (data2) using the trace that was generated from data1? So far I have just been passing data2 to the “with model_factory” block, like this -

with model_factory(data2):
    predictive_all_zeroed = pm.sample_posterior_predictive(trace=trace, random_seed=1234).to_dataframe(include_coords=False)

Is this the right way to score the model on a dataset that was not used to generate the trace?

cluhmann · September 11, 2023, 3:41pm

Have you seen the Out-Of-Sample Predictions notebook? If not, I would probably suggest starting there.
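
For intuition about why a trace generated from data1 can be reused with data2, here is a minimal pure-Python sketch of what posterior predictive sampling does conceptually. It is not PyMC code and not necessarily the notebook's approach: the toy linear model, the names posterior_draws, posterior_predictive and x_data2, and the simulated stand-in draws are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(1234)

# Hypothetical stand-in for a trace: posterior draws of the parameters of a
# toy linear regression y ~ Normal(intercept + slope * x, sigma).
# In practice these draws would come from the sampler run on data1.
posterior_draws = [
    {
        "intercept": rng.gauss(1.0, 0.1),
        "slope": rng.gauss(2.0, 0.1),
        "sigma": abs(rng.gauss(0.5, 0.05)),
    }
    for _ in range(2000)
]


def posterior_predictive(draws, x_new, rng):
    """Simulate one outcome per posterior draw for every new predictor value."""
    samples = []
    for d in draws:
        # Plug this draw's parameters into the likelihood, evaluated at x_new.
        row = [rng.gauss(d["intercept"] + d["slope"] * x, d["sigma"]) for x in x_new]
        samples.append(row)
    return samples  # one row per draw, one column per new data point


# Predictors from a second dataset that was never used to fit the parameters.
x_data2 = [0.0, 1.0, 2.5]
preds = posterior_predictive(posterior_draws, x_data2, rng)

# Posterior predictive mean for each new data point.
pred_means = [statistics.fmean(col) for col in zip(*preds)]
print(pred_means)
```

In this sketch the parameter draws depend only on the data used for fitting, and the new dataset enters only through the predictors plugged into the likelihood. By the same logic, a model rebuilt on data2 would presumably need to expose the same parameter names and shapes as the model that produced the trace.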
