Hybrid Bayesian Network getting stuck in PyMC3

I’m trying to sample from the joint probability distribution using PyMC3 in a hybrid Bayesian network described in Bayesian Networks: With Examples in R by Marco Scutari and Jean-Baptiste Denis:

The following R code using rjags in the book is fairly straightforward and works just fine. Note that the dependence on the previous crop (PR) has been removed from the example.


model <- "model {
PR ~ dcat(p.PR);
CL ~ dbeta(a.CL, b.CL);
G1 ~ dpois(CL * g.G1[PR]);
G2 ~ dpois(G1 * k.G2);
TR ~ dbern(ilogit((G1 - m.TR)/s.TR));
x.LO <- G2 * (1 - (1 - r.LO) * TR);
LO ~ dnchisqr(d.LO, x.LO);

dat0 <- list(p.PR = c(0.7, 0.2, 0.1),
             a.CL = 3, b.CL = 1,
             g.G1 = c(1, 3, 10),
             k.G2 = 10,
             m.TR = 5, s.TR = 2.5,
             r.LO = 1/3, d.LO = 1)

dat1 <- dat0
dat1$PR <- 1
mopest <- jags.model(file = textConnection(model), data = dat1,
                     quiet = TRUE)
update(mopest, 3000)
sipest <- 
  coda.samples(model = mopest, variable.names = c("LO", "G1", "G2", "TR", "CL"),
               n.iter  =  50000)
summa <- summary(sipest)

It produces the following distribution for G1, which depends only on CL. It looks like a fine Poisson distribution.

However, when I try to replicate these results in PyMC3, G1 remains 0 the entire time using the following, quite similar code.*

import matplotlib.pyplot as plt
%matplotlib inline
import pymc3 as pm

dat = {
    'a.cl': 3, 'b.cl': 1,
    'g.g1': (1,3,10),
    'k.g2': 10,
    'm.tr': 5, 's.tr': 2.5,
    'r.lo': 1/3, 'd.lo': 1

model = pm.Model()
with model:
    pr = pm.Categorical('pr', p=dat['p.pr'])
    cl = pm.Beta('cl', alpha=dat['a.cl'], beta=dat['b.cl'])
    g1 = pm.Poisson('g1', mu=cl*1) #particular choice of pr, makes categorical useless
    g2 = pm.Poisson('g2', mu=g1*dat['k.g2'])
    tr = pm.Bernoulli('tr', p=pm.invlogit((g1 - dat['m.tr'])/dat['s.tr']))
    xlo = pm.Deterministic('xlo',g2*(1-(1-dat['r.lo'])*tr))
    j = pm.Poisson('j', mu=xlo) # no ncx2 function, so we sample this way
                                # https://discourse.pymc.io/t/noncentral-chi-squared-distribution/432
    lo = pm.ChiSquared('lo', nu=dat['d.lo']+2*j)
    trace = pm.sample(draws=5000, njobs=4)


G1 never moves from 0 in this case:

LO is also incorrect, but if G1 is so wrong, then I figure LO doesn’t have a chance. I can’t figure out why the two packages are resulting in such different sampled distributions. Any insight on why PyMC3 gets stuck here?

*I’ve replaced the non-central chi-squared distribution with something a related distribution, but it shouldn’t impact G1, since G1 doesn’t depends on that.

We don’t have a Gibb sampler in PyMC3, and the Metropolis samplers could get stuck in high dimension easily as it tries to update all parameters at once and got proposal rejected most of the time. [Edit]: Metropolis is actually update parameters one by one as default, but still in high-dimension the rejection rate is high.
Nonetheless, you can try the code below which seems to give acceptable result (minor changes based on your implementation):

p_pr = np.asarray([0.7, 0.2, 0.1])
a_cl = 3.
b_cl = 1.
g_g1 = tt.as_tensor_variable(np.asarray([1, 3, 10]))
k_g2 = 10.
m_tr = 5.
s_tr = 2.5
r_lo = 1/3
d_lo = 1.

with pm.Model() as model:
    PR = pm.Categorical('PR', p=p_pr, testval=2)
    CL = pm.Beta('CL', alpha=a_cl, beta=b_cl)
    G1 = pm.Poisson('G1', mu=CL*g_g1[PR])
    G2 = pm.Poisson('G2', mu=G1*k_g2)
    TR = pm.Bernoulli('TR', p=pm.math.invlogit((G1-m_tr)/s_tr))
    x_LO = pm.Deterministic('x_LO', G2*(1-(1-r_lo)*TR))
    j = pm.Poisson('j', mu=x_LO) # no ncx2 function, so we sample this way
                                # https://discourse.pymc.io/t/noncentral-chi-squared-distribution/432
    LO = pm.ChiSquared('LO', nu=d_lo+2*j)
    trace = pm.sample(10000)

1 Like

That looks reasonable. Note though, that by default Metropolis tries to sample each RV separately, unless blocked=True is passed during initialization.

Oh thanks Thomas - somehow I have the impression that the default is block updated.

Ok, so if I’m understanding correctly, fixing g_g1 to be 1 in G1 = pm.Poisson('G1', mu=CL*g_g1[PR]) causes the joint probability space to be really narrow in one dimension, which makes it very difficult for some samplers to explore it? In that case, do I try to select a different sampler? Or is there something else to try?

What drives that impression?

Probably leftover impression from https://github.com/pymc-devs/pymc3/pull/2446 ¯\_(ツ)_/¯

Right, even with blocked=False it would still block-update multi-dimensional RVs. The Gibbs sampling is done on the RV-level only, which is a bit confusing.