Evaluating Variational Inference: Need Access to fitted (AD)VI approximation density

Hello PyMC users,

I would like to implement the recently introduced PSIS diagnostic for evaluating variational inference in PyMC, in particular for ADVI. For the corresponding publication, see "Yes, but did it work?: Evaluating variational inference" (Yao et al., 2018), in particular Algorithm 1.

For this I need to extract the log density of the fitted VI approximation at a number of sampled points \theta (samples drawn from the fitted VI approximation). Could you point me to an explanation of how to achieve this?

I also need to evaluate the log of the density p(\theta, y); would that be achieved through logp?

Many thanks in advance. For reference, here is a snapshot of the algorithm.

After you fit a model using ADVI, you can sample from the approximation:

with NonCentered_eight:
    # fit mean-field ADVI for 20,000 iterations
    ADVI1 = pm.fit(20000, method='advi', obj_n_mc=100,
                   obj_optimizer=pm.adagrad(learning_rate=0.1))
    # draw 10,000 samples from the fitted approximation
    noncentered_ADVI = ADVI1.sample(10000)

The resulting noncentered_ADVI can be treated just like a PyMC3 trace object containing MCMC samples. You then evaluate the model logp using each slice of the samples (we call it a point) as input:

logp_func = NonCentered_eight.logp
p_theta_y = []
for point in noncentered_ADVI:
    p_theta_y.append(logp_func(point))
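
Once you also have logq evaluated at the same draws (see further down in this thread), you can form the log importance ratios and compute the k-hat diagnostic from Algorithm 1. A minimal sketch, assuming p_theta_y and log_q are matching arrays and using ArviZ's psislw, which is not part of the original answer:

import numpy as np
import arviz as az

# log importance ratios: log p(theta, y) - log q(theta)
log_weights = np.asarray(p_theta_y) - np.asarray(log_q)

# Pareto-smoothed importance sampling; khat is the shape diagnostic
# (the paper treats khat < 0.7 as indicating a usable approximation)
smoothed_lw, khat = az.psislw(log_weights)
print(khat)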

If you are stuck, you can have a look at my partial port of the said paper here: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/master/WIP/[WIP]%20Comparing%20VI%20approximation.ipynb


This is great, many thanks!

Hi! Thanks for the info!

You just explained how we can get logp. However, I still have no clue how to get logq.

That part is in cell[11] of the notebook mentioned above. :wink:
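
For readers who don't want to open the notebook, here is the idea. A minimal sketch, assuming mean-field ADVI, where q is a diagonal Gaussian over the transformed (unconstrained) space and the fitted approximation exposes .mean and .std as PyMC3's MeanField does; the variable names follow the fitting code above:

import numpy as np
from scipy import stats

# parameters of the fitted mean-field Gaussian
mu = ADVI1.mean.eval()
sd = ADVI1.std.eval()

# draw samples in the transformed space and evaluate their log-density
samples = np.random.normal(mu, sd, size=(10000, len(mu)))
log_q = stats.norm.logpdf(samples, mu, sd).sum(axis=-1)

Note that for the PSIS diagnostic, logp and logq must be evaluated at the same draws and in the same (transformed) parameterization.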

Thanks for the quick reply.

There is one more question concerning logp. I am new to PyMC3's framework and this question may be silly. I see that in your implementation in the notebook, logp(\theta, y) is evaluated using approx.model.logp(point), where \theta is the latent variables, y is the data, and point here refers to \theta. However, I don't see why it is not approx.model.logp(point, y). Is this because y has been integrated into the model when we set the observed variable and fed it the observations?

If this is the case, how could we evaluate p(\theta, \hat{y}), where \hat{y} is not the original observation dataset?

Many thanks!

The observed data was set when you specified observed=... in the model.

You can set observed to a theano.shared variable, and use .set_value(...) to change the value.
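
A minimal sketch of that pattern; the one-parameter model below is a hypothetical stand-in, not the eight-schools model from earlier in the thread:

import numpy as np
import theano
import pymc3 as pm

y_obs = np.random.randn(100)      # original observations
y_shared = theano.shared(y_obs)   # wrap the data in a shared variable

with pm.Model() as model:
    mu = pm.Normal('mu', mu=0., sd=10.)
    pm.Normal('y', mu=mu, sd=1., observed=y_shared)

# later: swap in a new dataset y_hat and re-evaluate logp at a point
y_hat = np.random.randn(100)
y_shared.set_value(y_hat)
logp_at_yhat = model.logp({'mu': 0.5})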

Thanks so much.
