NaN occurred in optimization in a VonMises mixture model

In this particular use case, I think the problem is the transformation in the mixture. So I tried making the two models truly identical:

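import numpy as np
import pymc3 as pm

with pm.Model() as model:
    mu_1 = pm.VonMises('mu_1', mu=0, kappa=1)
    kappa_1 = pm.Gamma('kappa_1', 1, 1)
    vm_1 = pm.VonMises.dist(mu=mu_1, kappa=kappa_1)
    # fixed equal weights instead of a Dirichlet prior
    w = np.ones(2)*.5 # pm.Dirichlet('w', np.ones(2))
    # both components are the same distribution
    vm_comps = [vm_1, vm_1]
    vm = pm.Mixture('vm', w, vm_comps)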

This way vm should be identical to vm_1 in model1. However, ADVI still gives the same error.

The reason is that the Mixture class does not have a default transformation. This means the mixture in this case has support on [-pi, pi], while the approximation has support on (-inf, inf). When Mixture is used with observed data this is usually fine, but in this case the approximation sometimes goes outside the support and gives the error.
The solution is to assign a transformation:

import numpy as np
import pymc3 as pm
import pymc3.distributions.transforms as tr

with pm.Model() as model:
    mu_1 = pm.VonMises('mu_1', mu=0, kappa=1)
    kappa_1 = pm.Gamma('kappa_1', 1, 1)
    vm_1 = pm.VonMises.dist(mu=mu_1, kappa=kappa_1)
    w = np.ones(2)*.5 # pm.Dirichlet('w', np.ones(2))
    vm_comps = [vm_1, vm_1]
    vm = pm.Mixture('vm', w, vm_comps, transform=tr.circular)
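To illustrate why the transformation matters, here is a minimal NumPy sketch of the idea. It is not PyMC3's actual implementation: the helper `wrap_to_circle` and the Gaussian draws standing in for the ADVI approximation are assumptions made for illustration. It shows that an unconstrained approximation puts mass outside [-pi, pi], and that wrapping the values brings every one of them back into the support without changing the angle.

```python
import numpy as np

def wrap_to_circle(y):
    """Map any real value to the equivalent angle in [-pi, pi] (hypothetical helper)."""
    return np.arctan2(np.sin(y), np.cos(y))

# Hypothetical draws from an unconstrained Gaussian approximation
# centred close to the edge of the circular support.
rng = np.random.default_rng(0)
unconstrained = rng.normal(loc=3.0, scale=1.0, size=1000)

# Without a transform, these draws fall outside [-pi, pi].
outside = np.abs(unconstrained) > np.pi

# After wrapping, every draw lies inside the support.
wrapped = wrap_to_circle(unconstrained)

print("draws outside support before wrapping:", int(outside.sum()))
print("draws outside support after wrapping:", int((np.abs(wrapped) > np.pi).sum()))
```

Wrapping only shifts a value by a multiple of 2*pi, so it represents the same angle as before.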

I believe now it should be much more robust. However, my previous comment still applies:
