Adding a Deterministic variable after sampling

mike-lawrence · November 3, 2023, 6:32am

I have a model with a number of variables of interest that are derived from the parameters, but declaring them all as Deterministic so they appear in the InferenceData output makes sampling slower than if I don’t declare them as Deterministic. I know I can compute them by hand from the Xarray representations of the posterior variables, but I’m curious whether there’s a more straightforward way: adding them as new Deterministic-declared variables to the model object after sampling and calling something akin to pm.sample_posterior_predictive(trace) (but without any stochasticity). Or is there genuinely no option but to compute them by hand with the Xarrays?

ricardoV94 · November 3, 2023, 9:44am

You can indeed use sample_posterior_predictive without any stochasticity (unless the deterministic depends on an observed variable, does it?).

ricardoV94 · November 3, 2023, 10:53am

Here is a snippet:

import pymc as pm
import numpy as np

from pymc.model.fgraph import clone_model

with pm.Model() as m:
    x = pm.Beta("x", 1, 1)
    idata = pm.sample(progressbar=False)
    
with clone_model(m) as clone_m:
    det = pm.Deterministic("det", clone_m["x"] + 1)
    pp = pm.sample_posterior_predictive(idata, var_names=["det"], progressbar=False)
    
assert np.all(pp.posterior_predictive["det"] == idata.posterior["x"] + 1)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [x]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 2 seconds.
Sampling: []

Note the second Sampling: [], which corresponds to sample_posterior_predictive and means there is nothing stochastic in it.

I used clone_model to not modify the original model, but if you are not concerned about it you can add the Deterministic on the original model after sampling.

If you need a Deterministic that depends on the values of the observed variable, you can replace the observed variable with its data using pm.do instead of just calling clone_model.

mike-lawrence · November 4, 2023, 12:22am

Do the new deterministic variables have to be functions of parameter variables or other existing deterministic variables as in this case? In my case, I have a whole chain of derived variables and it’s only the last that I’d like the values for after sampling, akin to:

with pm.Model() as model:
    # priors
    ...
    # derived variables
    ...
    interesting_dv = ... # interesting dv computed from earlier uninteresting vars
    # likelihood
    ...

ricardoV94 · November 4, 2023, 9:09am

Yes, because the values from the trace will be used as inputs in the posterior predictive function.

You can always have a helper function that returns the uninteresting variables from the model variables to avoid code duplication if that’s the concern?

mike-lawrence · November 4, 2023, 11:48pm

Hm, I’m getting different values for even Deterministic quantities when I declare them in this way. Here’s a minimal example:

import pymc as pm
from pymc.model.fgraph import clone_model

with pm.Model() as model:
    data = pm.ConstantData('data',[0])
    mu = pm.Normal('mu', mu=0, sigma=1)
    mu_squared1 = pm.Deterministic('mu_squared1', mu**2)
    y = pm.Normal('y', mu=mu, sigma=1, observed=data)


with clone_model(model) as model2:
    mu_squared2 = pm.Deterministic('mu_squared2', mu**2)


with model2:
    trace = pm.sample_prior_predictive(
        samples = 1
        , var_names = ['mu','mu_squared1','mu_squared2']
    )

Which yields:

>>> trace['prior']['mu']
array([[-1.44118118]])

>>> trace['prior']['mu_squared1']
array([[2.0770032]])

>>> trace['prior']['mu_squared2']
array([[0.25616674]])

mike-lawrence · November 4, 2023, 11:57pm

Oh, I failed to notice that when cloning, one can’t simply refer to the original Python variables as normal but has to refer to them as elements of the cloned model object, so this works as expected:

import pymc as pm
from pymc.model.fgraph import clone_model

with pm.Model() as model:
    data = pm.ConstantData('data',[0])
    mu = pm.Normal('mu', mu=0, sigma=1)
    mu_squared1 = pm.Deterministic('mu_squared1', mu**2)
    y = pm.Normal('y', mu=mu, sigma=1, observed=data)


with clone_model(model) as model2:
    mu_squared2 = pm.Deterministic('mu_squared2', model2['mu']**2)


with model2:
    trace = pm.sample_prior_predictive(
        samples = 1
        , var_names = ['mu','mu_squared1','mu_squared2']
    )
