Getting observations of distribution variables whilst sampling

GitHub issue #2020 by 23pointsNorth:

Hi,
I have a model with a set of latent variables that I am trying to evaluate. So far I have worked with a custom MCMC sampler and want to switch to HMC/NUTS. However, I was unable to use it as a drop-in replacement: the model variables in PyMC3 are symbolic tensors (e.g. Subtensor{}.0), whereas the ‘simulation’ of the world I am using expects actual values to perform the arithmetic.

The latent space can be viewed as 6 independent variables. These are fed to a ‘black-box’ external function, which returns a vector of values that needs to match a predefined distribution (a 1D Gaussian).

Something along the lines of:

    import numpy as np
    from pymc3 import Model, NUTS, Normal, sample

    basic_model = Model()
    with basic_model:
        # Latent variables: 6 independent standard normals
        x = Normal('x', mu=0, sd=1, shape=6)
        # Expected value of outcome, computed by the black-box simulation
        _, _, q2d, _ = evalWorld(x)
        y = np.linalg.norm(q2d, axis=1)
        # Likelihood (sampling distribution) of observations:
        # I want y to match N(target_mu, target_sigma)
        Y_obs = Normal('Y_obs', mu=target_mu, sd=target_sigma, observed=y)

        step = NUTS()
        # draw n posterior samples
        trace = sample(n, step=step, start=start)

What do I have to do to x in order to pass actual values/samples of x, rather than the symbolic distribution variable itself?

I am aware that HMC requires a gradient of Y_obs, which may be a problem, as some aspects of evalWorld may not be differentiable.

You can write a custom Theano Op. The Theano documentation contains a section about that, and you can have a look at the exoplanet example (which still isn’t a full blog post, but I’m working on it :slight_smile:)

You get the real values in the Op.perform method. Op.grad needs to return Theano ops for the gradient.
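
For illustration, here is a minimal sketch of what such an Op could look like. The names EvalWorldOp, FiniteDiffGradOp, and eval_world_norm are hypothetical, evalWorld is the black-box function from the question, and the finite-difference gradient is only one possible way to fill in Op.grad (it assumes evalWorld is reasonably smooth):

    import numpy as np
    import theano
    import theano.tensor as tt

    def eval_world_norm(x):
        """Run the black-box simulation on concrete values and reduce
        its output to the vector we want to constrain."""
        _, _, q2d, _ = evalWorld(x)        # the external black box
        return np.linalg.norm(q2d, axis=1)

    class EvalWorldOp(theano.Op):
        """Wraps the black-box function so it can sit in the Theano graph."""
        itypes = [tt.dvector]              # input: the length-6 latent vector
        otypes = [tt.dvector]              # output: the vector of norms

        def perform(self, node, inputs, output_storage):
            (x,) = inputs                  # x is a plain numpy array here
            output_storage[0][0] = eval_world_norm(x)

        def grad(self, inputs, output_grads):
            # NUTS needs this; it must return symbolic Theano expressions,
            # so we delegate to a second Op that differentiates numerically.
            (x,) = inputs
            (g,) = output_grads
            return [FiniteDiffGradOp()(x, g)]

    class FiniteDiffGradOp(theano.Op):
        """Vector-Jacobian product of eval_world_norm via central differences."""
        itypes = [tt.dvector, tt.dvector]
        otypes = [tt.dvector]

        def perform(self, node, inputs, output_storage):
            x, g = inputs
            eps = 1e-6
            vjp = np.empty_like(x)
            for i in range(x.size):
                dx = np.zeros_like(x)
                dx[i] = eps
                col = (eval_world_norm(x + dx)
                       - eval_world_norm(x - dx)) / (2 * eps)
                vjp[i] = col.dot(g)        # accumulate (J^T g)[i]
            output_storage[0][0] = vjp

In the model, the Op then replaces the direct call to evalWorld. One thing to keep in mind: PyMC3 does not allow observed data that depends on other model variables, so one way to express the "y should match N(target_mu, target_sigma)" soft constraint is a Potential instead of an observed Normal:

    import pymc3 as pm

    eval_world_op = EvalWorldOp()

    with pm.Model() as basic_model:
        x = pm.Normal('x', mu=0, sd=1, shape=6)
        y = eval_world_op(x)               # symbolic here, concrete in perform()
        # add the Normal log-density of y to the model's log-probability
        pm.Potential('y_match',
                     pm.Normal.dist(mu=target_mu, sd=target_sigma).logp(y).sum())
        trace = pm.sample(1000)            # NUTS is used since gradients exist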

@23pointsNorth, you can also find a similar discussion here:
https://github.com/pymc-devs/pymc3/issues/1925