I’m trying to use PyMC3 to sample from the joint probability distribution of a hybrid Bayesian network described in Bayesian Networks: With Examples in R by Marco Scutari and Jean-Baptiste Denis:
The following R code from the book, which uses rjags, is fairly straightforward and works just fine. Note that the dependence on the previous crop (PR) has been removed from the example.
library(rjags)
model <- "model {
  PR ~ dcat(p.PR);
  CL ~ dbeta(a.CL, b.CL);
  G1 ~ dpois(CL * g.G1[PR]);
  G2 ~ dpois(G1 * k.G2);
  TR ~ dbern(ilogit((G1 - m.TR) / s.TR));
  x.LO <- G2 * (1 - (1 - r.LO) * TR);
  LO ~ dnchisqr(d.LO, x.LO);
}"
dat0 <- list(p.PR = c(0.7, 0.2, 0.1),
             a.CL = 3, b.CL = 1,
             g.G1 = c(1, 3, 10),
             k.G2 = 10,
             m.TR = 5, s.TR = 2.5,
             r.LO = 1/3, d.LO = 1)
dat1 <- dat0
dat1$PR <- 1
mopest <- jags.model(file = textConnection(model), data = dat1,
                     quiet = TRUE)
update(mopest, 3000)
sipest <- coda.samples(model = mopest,
                       variable.names = c("LO", "G1", "G2", "TR", "CL"),
                       n.iter = 50000)
summa <- summary(sipest)
print(summa)
plot(sipest[,"G1"])
It produces the following distribution for G1, which depends only on CL. It looks like a fine Poisson distribution.
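For reference, and as a check independent of either sampler, the marginal of G1 with PR fixed to 1 can be forward-simulated directly with NumPy (a quick sketch of my own, not from the book):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# With PR fixed to 1, the Poisson rate is CL * g.G1[1] = CL * 1
cl = rng.beta(3, 1, size=n)   # CL ~ Beta(a.CL = 3, b.CL = 1)
g1 = rng.poisson(cl)          # G1 ~ Poisson(CL)

print(g1.mean())  # should land near E[CL] = 3/4
```

The sample mean comes out near 0.75. Zero is the single most likely value of G1 under this prior, but a correct sampler should certainly not leave G1 pinned at 0 for every draw.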
However, when I try to replicate these results in PyMC3 with the following, quite similar code, G1 remains 0 the entire time.*
import matplotlib.pyplot as plt
%matplotlib inline
import pymc3 as pm
dat = {
    'p.pr': (0.7, 0.2, 0.1),
    'a.cl': 3, 'b.cl': 1,
    'g.g1': (1, 3, 10),
    'k.g2': 10,
    'm.tr': 5, 's.tr': 2.5,
    'r.lo': 1/3, 'd.lo': 1
}
model = pm.Model()
with model:
    pr = pm.Categorical('pr', p=dat['p.pr'])
    cl = pm.Beta('cl', alpha=dat['a.cl'], beta=dat['b.cl'])
    g1 = pm.Poisson('g1', mu=cl*1)  # PR fixed to 1, so the Categorical is effectively unused
    g2 = pm.Poisson('g2', mu=g1*dat['k.g2'])
    tr = pm.Bernoulli('tr', p=pm.invlogit((g1 - dat['m.tr'])/dat['s.tr']))
    xlo = pm.Deterministic('xlo', g2*(1 - (1 - dat['r.lo'])*tr))
    # No noncentral chi-squared in PyMC3, so sample it as a Poisson mixture of
    # central chi-squareds: J ~ Poisson(noncentrality/2), LO | J ~ ChiSquared(d + 2J)
    # https://discourse.pymc.io/t/noncentral-chi-squared-distribution/432
    j = pm.Poisson('j', mu=xlo/2)
    lo = pm.ChiSquared('lo', nu=dat['d.lo'] + 2*j)
    trace = pm.sample(draws=5000, njobs=4)

print(pm.stats.summary(trace))
pm.plots.traceplot(trace)
G1 never moves from 0 in this case:
LO is also incorrect, but if G1 is so wrong, then I figure LO doesn’t have a chance. I can’t figure out why the two packages are resulting in such different sampled distributions. Any insight on why PyMC3 gets stuck here?
*I’ve replaced the non-central chi-squared distribution with a related Poisson-mixture construction, but that shouldn’t affect G1, since G1 doesn’t depend on LO.
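For what it’s worth, that Poisson-mixture construction can be checked against SciPy’s noncentral chi-squared directly (a quick sanity check of my own; note that the Poisson mean is half the noncentrality):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
df, nc = 1.0, 4.0  # arbitrary test values: degrees of freedom, noncentrality
n = 200_000

# Direct draws from the noncentral chi-squared
direct = stats.ncx2.rvs(df, nc, size=n, random_state=rng)

# Poisson mixture: J ~ Poisson(nc / 2), then X | J ~ ChiSquared(df + 2 * J)
j = rng.poisson(nc / 2, size=n)
mixture = rng.chisquare(df + 2 * j)

print(direct.mean(), mixture.mean())  # both should be near df + nc = 5
```

The two sample means agree, matching the known mean df + nc of the noncentral chi-squared.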