'TransformedDistribution' object has no attribute 'random' during pm.sample_ppc


#1

I am working through this example in the pymc3 docs: Marginalized Gaussian Mixture Model

import numpy as np
import pymc3 as pm
import theano

# N is not defined in the post; 1000 is assumed here (the value used in the docs example)
N = 1000

W = np.array([0.35, 0.4, 0.25])
MU = np.array([0., 2., 5.])
SIGMA = np.array([0.5, 0.5, 1.])
component = np.random.choice(MU.size, size=N, p=W)
x = np.random.normal(MU[component], SIGMA[component], size=N)
x_shared = theano.shared(x)
with pm.Model() as model:
    w = pm.Dirichlet('w', np.ones_like(W))
    mu = pm.Normal('mu', 0., 10., shape=W.size)
    tau = pm.Gamma('tau', 1., 1., shape=W.size)

    x_obs = pm.NormalMixture('x_obs', w, mu, tau=tau, observed=x_shared)

Then I created new (holdout/test) data and swapped it into the shared Theano tensor.

x_shared.set_value(new_obs)

However, when I try to predict the w, mu and tau for the new data, I get the error below.

with model:
    ppc_trace = pm.sample_ppc(trace, vars = model.free_RVs, 5000, random_seed=SEED)
AttributeError: 'TransformedDistribution' object has no attribute 'random'

My investigation below showed that w and tau are the “TransformedDistributions”

vars = model.basic_RVs
for var in vars:
    print(var)
    print(var.distribution)

#2

Actually, you are sampling model.free_RVs, which in this case includes w_stick_breaking__ and tau_log__, and these don't have a random method. Nonetheless, we should be able to add a random method for transformed distributions as well: the way to go is to sample from the basic_RVs and use the transformed forward value function to get the sample in the transformed space.


#3

@junpenglao I am not sure what you mean by “transformed forward value function”? Could you please elaborate?

In any case, sampling from the basic_RVs (as you suggested) does not work because it includes w_stick_breaking__ and tau_log__, which do not have a random function.


#4

The main inference methods implemented in PyMC3 (NUTS and ADVI) sample or approximate parameters that live on the real line (-inf, inf). Thus, for a bounded parameter (e.g. one with an Exponential distribution), PyMC3 first applies a transformation from the parameter's domain to the real line and does the sampling/approximation there.
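
To make the "forward value function" idea concrete, here is a minimal sketch in plain Python. It is not PyMC3's implementation: the names forward and backward are only illustrative, and it covers just the log transform that would apply to a positive parameter such as tau. The point is that a random draw lives in the original (positive) space, and the forward function is what moves it to the unconstrained space.

```python
import math
import random


def forward(value):
    """Map a positive value to the real line (log transform)."""
    return math.log(value)


def backward(unconstrained):
    """Map any real value back to the positive domain."""
    return math.exp(unconstrained)


rng = random.Random(0)

# Draw in the original space, here from a Gamma(1, 1) distribution,
# then push each draw forward into the unconstrained space.
draws = [rng.gammavariate(1.0, 1.0) for _ in range(5)]
transformed = [forward(d) for d in draws]

# Going backward recovers the original draws up to floating-point error.
recovered = [backward(t) for t in transformed]
```

Any real number fed to backward gives a valid positive value, which is why the sampler can move freely in the transformed space.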

The random method is usually defined on the original space, not on the transformed one. For the purposes of your example, you can do:

...
pm.sample_ppc(trace, samples=5000, vars=[w, mu, tau], random_seed=SEED)