Including a Variable in a Model without it being a free random variable

Hello there,

I am trying to perform Hamiltonian Monte Carlo (HMC) sampling of a parameter theta in a model. The desired behavior of the HMC algorithm should follow the pseudocode below:

for i in num_HMC_samples:

  • sample z ~ N(0, 1, shape)  # i.e., draw a fresh z for each new sample in the HMC chain
  • use z and theta to calculate a potential; this is the posterior from which theta is sampled
  • perform an HMC step on theta with respect to this potential

The reason for sampling z at each HMC iteration is that the potential from which I am sampling theta involves an expectation term, which I plan to estimate by reparameterizing z and then taking the mean of the resulting values.
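To make the expectation estimate concrete, here is a minimal NumPy sketch of the reparameterization step described above. The quadratic mapping and integrand are made-up stand-ins for the real model; only the pattern (draw z, reparameterize with theta, average) is the point:

```python
import numpy as np

rng = np.random.default_rng(0)
n_dims, n_samples = 3, 1000

theta = np.ones(n_dims)  # current HMC position (placeholder value)

# Fresh standard-normal draws for this HMC iteration
z = rng.standard_normal((n_dims, n_samples))

# Reparameterize: x = mu(theta) + sigma(theta) * z.
# Here mu = theta and sigma = 1 are stand-ins for the real mapping.
x = theta[:, None] + z

# Monte Carlo estimate of E[f(x)], with f(x) = x**2 as a toy integrand
expectation_estimate = (x ** 2).mean(axis=1)

# The potential for this iteration would then be built from the
# estimate, e.g. its sum (purely illustrative):
potential = expectation_estimate.sum()
```

With theta = 1 and unit scale, the true value of E[x**2] is 2 per dimension, so the estimate should land close to that with 1000 samples.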

My model for this example looks something like this:

import pymc as pm

with pm.Model() as test_model:
    theta = pm.Flat('theta', shape=n_dims)
    z = pm.Normal('z', 0, 1, shape=(n_dims, n_samples))

    potential = ...  # calculated from theta and z
    pot = pm.Potential('p', potential)

Now I would like to sample only theta from this model using HMC, but every time I try, PyMC also assigns a NUTS sampler to z. Is there a way to prevent this while still drawing a fresh z for each HMC sample?

In theory you could write:

z = pm.Normal.dist(0, 1, shape=(n_dims, n_samples)) 

And z would be a random node for which no inference is performed. BUT we explicitly forbid RandomVariables in our logp graphs, except for the Simulator class.

Even then, I am not sure this would do what you want: z would change on every logp evaluation within a single NUTS step.

To avoid this (and the fact that we forbid RandomVariables in the logp graph), the easiest way is to create a custom step sampler for z that always returns a draw from the prior. Each NUTS step then works on a "frozen" draw of z, but no inference is performed on z:

with pm.Model() as test_model:
    theta = pm.Flat('theta', shape=n_dims)
    z = pm.Normal('z', 0, 1, shape=(n_dims, n_samples))

    potential = ...  # calculated from theta and z
    pot = pm.Potential('p', potential)

    step = PriorDraws(vars=[z])  # custom step sampler (see linked example below)
    trace = pm.sample(step=step)
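To see what "frozen" means here, a plain-NumPy sketch of the scheme (the toy potential and all names are made up; this is not PyMC's step API, and theta is not actually updated, since only the handling of z is being illustrated):

```python
import numpy as np

rng = np.random.default_rng(1)
n_dims, n_samples = 2, 5

def potential(theta, z):
    # Placeholder potential built from theta and a frozen z
    return -0.5 * np.sum(theta ** 2) + z.mean()

theta = np.zeros(n_dims)
n_steps, n_leapfrog = 3, 4

evals_per_step = []
for step in range(n_steps):
    # "PriorDraws"-style update: one fresh prior draw of z per outer step
    z = rng.standard_normal((n_dims, n_samples))

    # Inside a single NUTS/HMC step, z stays frozen across every
    # potential (and gradient) evaluation of the leapfrog integration:
    vals = [potential(theta, z) for _ in range(n_leapfrog)]
    evals_per_step.append(vals)
```

All evaluations within one step agree because they share the same z, while different steps see different draws of z.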

There is a fully worked out example of such a prior step sampler by @junpenglao here: Two-stage Bayesian regression enforcing a fixed distribution (not Just Hierarchical regression) - #15 by junpenglao

Just to clarify: what I want is for z to be sampled randomly from a normal distribution at each HMC iteration, purely to be used in calculating the posterior, without changing the distribution of z across iterations and without performing inference over z (which I assume means obtaining a chain of samples for z itself).

In this case, would just using

pm.Normal.dist(..) instead of pm.Normal(..)

work?

No, you can’t just do that, for the reasons I mentioned. You can, however, create a step sampler that returns draws from the prior rather than from the posterior (as a step sampler usually would). The example in the linked discussion does exactly that.