Tutorial: running minibatch ADVI on PyMC v4

Hello,

I’m quite new to PyMC and am trying out the GLM minibatch tutorial on PyMC v4.

The approximation fits without any issues, but converting samples drawn from it into an ArviZ InferenceData object raises an error.

The line that causes the error:

idata_advi = az.from_pymc3(approx.sample(2000))

Error trace:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
/tmp/ipykernel_2197406/717053261.py in <module>
     69     # run ADVI with minibatch
     70     approx = pm.fit(100000, callbacks=[pm.callbacks.CheckParametersConvergence(tolerance=1e-4)])
---> 71     idata_advi = az.from_pymc3(approx.sample(2000))

~/.local/lib/python3.8/site-packages/arviz/data/io_pymc3_3x.py in from_pymc3(trace, prior, posterior_predictive, log_likelihood, coords, dims, model, save_warmup, density_dist_obs)
    578     InferenceData
    579     """
--> 580     return PyMC3Converter(
    581         trace=trace,
    582         prior=prior,

~/.local/lib/python3.8/site-packages/arviz/data/io_pymc3_3x.py in __init__(self, trace, prior, posterior_predictive, log_likelihood, predictions, coords, dims, model, save_warmup, density_dist_obs)
     73         density_dist_obs: bool = True,
     74     ):
---> 75         import pymc3
     76 
     77         try:

ModuleNotFoundError: No module named 'pymc3'

From what I understand, this happens because az.from_pymc3 imports the pymc3 package, which is not installed alongside PyMC v4.

Is this tutorial outdated, or does something need to be fixed?

An update on this issue: after installing pymc3 (which I think should not really be required, given that v4 is already installed), the error changes to the following:

ValueError: Using the `InferenceData` as a `trace` argument won't work. Please use the `arviz.InferenceData.extend` method to extend the `InferenceData` with groups from another `InferenceData`.

I’ll try to find a fix for this, but it seems to me that the tutorial is not quite correct for v4 and needs to be fixed.

What sort of solved the issue for me was changing the problematic line to this:

idata_advi = approx.sample(2000)
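The ValueError above suggests that, under v4, approx.sample already returns an InferenceData, so there is nothing left to convert. If the same script has to run against both setups, a small guard along these lines might help. This is only a sketch: as_inference_data is a made-up helper name, the check relies on InferenceData exposing a groups() method, and the az.from_pymc3 fallback assumes an ArviZ version that still ships that function together with an installed pymc3.

```python
def as_inference_data(samples):
    """Return an InferenceData for `samples` (hypothetical helper, not a PyMC/ArviZ API).

    If `samples` already looks like an InferenceData, it is returned unchanged.
    Otherwise it is assumed to be a PyMC3 trace and converted.
    """
    # Duck-typing check: arviz.InferenceData exposes a callable `groups()`.
    if callable(getattr(samples, "groups", None)):
        return samples

    # Assumption: an older ArviZ with `from_pymc3`, plus pymc3 installed.
    import arviz as az

    return az.from_pymc3(samples)
```

With this, idata_advi = as_inference_data(approx.sample(2000)) would pass the v4 result through untouched.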

I say “sort of solved” because idata_advi is now an InferenceData object, but it only has a posterior group.

It has no log_likelihood group, so trying to compute something like WAIC raises an error:

TypeError: log likelihood not found in inference data object
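To avoid hitting this TypeError, one option is to check which groups the object has before asking for WAIC. The following is a rough sketch, not an official recipe: has_group and waic_or_none are hypothetical helper names, and the only library calls assumed are InferenceData.groups() and az.waic.

```python
def has_group(idata, name):
    """True if the InferenceData-like object reports a group called `name`."""
    return name in idata.groups()


def waic_or_none(idata):
    """Compute WAIC only when a log_likelihood group is present, else return None."""
    if not has_group(idata, "log_likelihood"):
        # Nothing to compute from: the ADVI result here only carries a posterior.
        return None

    import arviz as az

    return az.waic(idata)
```

Calling idata_advi.groups() directly is also a quick way to confirm that only the posterior group is there.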

We deliberately decided not to include the log_likelihood in VI. This is explained in another thread: Error when comparing ADVI models with az.compare - #4 by ferrine
