Just an update: I have re-run the R code a couple of times and I cannot get it to converge again.
Inference for Stan model: anon_model.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
                mean se_mean    sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
lengthscaleS    0.09    0.04  0.06  0.00  0.05  0.11  0.13  0.19     2 2.51
lengthscaleT   17.96    7.41 11.15  0.02  7.02 21.61 25.79 33.40     2 2.80
a               0.16    0.04  0.51  0.07  0.09  0.11  0.13  0.34   198 1.02
mu              0.90    0.00  0.06  0.78  0.86  0.90  0.94  1.02  3837 1.00
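For reference, my reading of what both implementations target (reconstructed from the code, so treat as a sketch): a self-exciting (Hawkes-type) spatio-temporal point process with conditional intensity

$$\lambda(s, t) = \mu_0 + \mu\,\hat{\mu}(s, t) + \sum_{j\,:\,t_j < t} a\,\ell_T\, e^{-\ell_T (t - t_j)}\, k\!\left(\lVert s - s_j \rVert;\, \ell_S\right),$$

where $k$ is the Gaussian spatial kernel (gk), with log-likelihood

$$\log L = \sum_i \log \lambda(s_i, t_i) - \int_S \int_T \lambda(s, t)\, \mathrm{d}s\, \mathrm{d}t,$$

and the Potential in the code below approximating the compensator (the double integral) term by term.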
I have done some ad hoc prior adjustment to the PyMC model and get close-ish results this way:
import pymc as pm
import aesara.tensor as at

with pm.Model() as mod:
    lengthscaleS = pm.HalfNormal("lengthscaleS", 10)  # length scale, space
    lengthscaleT = pm.HalfNormal("lengthscaleT", 10)  # length scale, time
    a = pm.HalfNormal("a", 0.025)                     # excitation parameter
    gs = gk(spaceD, lengthscaleS)                     # Gaussian kernel over pairwise spatial distances
    mu0 = pm.HalfNormal("mu0", 1)
    mu = pm.HalfNormal("mu", 1)
    mu_xyt = mu0 + mu * muhat                         # background intensity at each event

    # excitation: exponential decay in time, Gaussian kernel in space,
    # summed over strictly earlier events (strict lower triangle)
    lam0 = (a * lengthscaleT * at.exp(-timeD * lengthscaleT)) * gs
    lam = (lam0 * at.tril(at.ones(timeD.shape), k=-1)).sum(axis=-1)
    lam = pm.Deterministic("lam", mu_xyt + lam)

    muSTintegral = len(xyt)  # number of events, used as the integral term below

    def lp(lam, a, lengthscaleT, mu, mu0, time, integral):
        # sum of log-intensities at the events minus the compensator terms
        return (at.sum(at.log(lam)) - integral * mu - mu0 * space_window * time_window
                + a * at.sum(at.exp(-time * lengthscaleT)) - integral)

    y = pm.Potential("y", lp(lam, a, lengthscaleT, mu, mu0, timeD2, muSTintegral))
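A quick sanity check on the custom Potential (a sketch; assumes the PyMC v4 API, i.e. Model.compile_logp and Model.initial_point):

logp_fn = mod.compile_logp()          # compiled joint log-probability, Potential included
print(logp_fn(mod.initial_point()))   # -inf or nan here means the likelihood is broken before sampling starts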
with mod:
    idata = pm.sample(1000, tune=1000, chains=4, cores=12, random_seed=33)  # target_accept=0.95
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 12 jobs)
NUTS: [lengthscaleS, lengthscaleT, a, mu0, mu]
|████████| 100.00% [8000/8000 04:26<00:00 Sampling 4 chains, 2,191 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 292 seconds.
There were 2191 divergences after tuning. Increase `target_accept` or reparameterize.
                mean     sd  hdi_3%  ...  ess_bulk  ess_tail  r_hat
lengthscaleS   0.159  0.032   0.101  ...    1401.0    2026.0    1.0
lengthscaleT  20.995  3.177  15.567  ...    1011.0    1432.0    1.0
a              0.138  0.019   0.102  ...    1746.0    1749.0    1.0
mu             0.867  0.061   0.752  ...    1085.0    1396.0    1.0
Note that the prior adjustment for a (the excitation parameter) is quite drastic, and I still get over 2,000 divergences (even though the Rhats and ESS look okay).
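To see where those divergences concentrate, a sketch with ArviZ (variable names are the model's own):

import arviz as az
print(int(idata.sample_stats["diverging"].sum()))  # total divergent transitions
# Divergences clustering in a corner of (lengthscaleT, a) would point to a
# funnel-like geometry that raising target_accept alone may not fix:
az.plot_pair(idata, var_names=["lengthscaleS", "lengthscaleT", "a"], divergences=True)

If they do cluster there, a reparameterization (e.g. sampling the length scales on the log scale) is probably the next step, per the sampler's warning.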