Emcee-style blobs


The concept of blobs doesn’t seem to exist in PyMC3 (see https://emcee.readthedocs.io/en/latest/user/blobs).
Am I wrong? Would it be a reasonable feature request?



The only thing that vaguely resembles the blobs you linked above are the variational inference’s callbacks.

If you are interested in adding such a feature with a PR, you are more than welcome. My opinion is that you could maybe raise the feature request issue in pymc4’s repo, because it is still in a very early phase, so adding blobs might be easier there than in pymc3.

Depending on what you need access to, you should be able to do this with pm.Deterministic. For example, the emcee example for the eight schools model can be done like this:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

J = 8
y = np.array([28.,  8., -3.,  7., -1.,  1., 18., 12.])
sigma = np.array([15., 10., 16., 11.,  9., 11., 10., 18.])

with pm.Model() as model:
    mu = pm.Normal('mu', mu=0, sd=5)
    tau = pm.HalfCauchy('tau', beta=5)
    theta = pm.Normal('theta', mu=mu, sd=tau, shape=J)
    obs = pm.Normal('obs', mu=theta, sd=sigma, observed=y)
    logp = pm.Deterministic('logp', tt.sum([tt.sum(v.distribution.logp(v)) for v in model.vars]))
    log_lik = pm.Deterministic('log_lik', tt.sum([tt.sum(v.distribution.logp(v)) for v in model.observed_RVs]))
    trace = pm.sample()

We already gather the posterior log probability in sampler_stats, and you can verify that they all match up:

(trace.get_sampler_stats('model_logp') - (trace['logp'] + trace['log_lik'])).max()  # 0.0
1 Like

Thanks, pm.Deterministic seems to be the way to go indeed. I wanted something completely arbitrary returned with the log likelihood, like this:

import numpy as np
import pymc3 as pm

def gaussian(x, mu, sig):
    return np.exp(-np.power(x - mu, 2.) / (2 * np.power(sig, 2.)))

def logp(x):
    l1 = gaussian(x, -10, 1)
    l2 = gaussian(x, 50, 10)
    blob = 'my_blob'
    return np.log(l1 + l2 / 50), blob

with pm.Model() as model:
    X = pm.Uniform('X', lower=-100., upper=100., testval=0.)
    lgp, blb = logp(X)
    llk = pm.Potential('llk', lgp)

    blob = pm.Deterministic('blob', blb)

But this fails with:

AttributeError                            Traceback (most recent call last)
<ipython-input-59-563568a1bb76> in <module>
      4     llk = pm.Potential('llk', lgp)
----> 6     blob = pm.Deterministic('blob', blb)

~/soft/miniconda3/lib/python3.7/site-packages/pymc3/model.py in Deterministic(name, var, model)
   1431     """
   1432     model = modelcontext(model)
-> 1433     var = var.copy(model.name_for(name))
   1434     model.deterministics.append(var)
   1435     model.add_random_variable(var)

AttributeError: 'str' object has no attribute 'copy'

Does blb need to be wrapped into a special object?

Yikes. I believe it has to be a theano tensor, and I do not know if there are string types.

You’re right, wrapping the blob with theano.tensor._shared(blob) works (as long as the blob is a numerical NumPy array):

from theano.tensor import _shared

def logp(x):
    l1 = gaussian(x, -10, 1)
    l2 = gaussian(x, 50, 10)
    blob = np.array([0, 1, 2])
    return np.log(l1 + l2 / 50), _shared(blob)
1 Like