I am trying to constrain the sum of several variables, but I am finding this unexpectedly hard in PyMC3. Here is a minimal example:
import numpy as np
import pymc3 as pm

with pm.Model() as model:
    i = pm.Categorical('i', np.arange(10) + 0.)
    j = pm.Categorical('j', np.arange(10) + 0.)
    total = pm.Normal("total", mu=i + j, sigma=1., observed=3.)
    trace = pm.sample(1000, tune=1000, discard_tuned_samples=True, return_inferencedata=True)
Here is the output:
Multiprocess sampling (4 chains in 4 jobs)
CategoricalGibbsMetropolis: [j, i]
100.00% [8000/8000 00:04<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 5 seconds.
The number of effective samples is smaller than 25% for some parameters.
The algorithm does not converge …
Now if I look at the trace, the results for i and j are as expected (numbers between 0 and 9), but total is completely off the mark, with mostly negative values:
total (chain, draw, total_dim_0) float64 -1.419 -0.9189 … -2.919 -2.919
It looks like a bug to me. Is there any other way of doing this?
The reason I am attempting this is a more complex model in which I first need to tune sub-components of the model independently of each other and then, in a final step, resample from the posterior while constraining the sum. I think I could do that by hand quite easily, but I cannot figure out how to do it in PyMC3. Any ideas? Thanks very much!
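For reference, this is roughly what I mean by doing it "by hand" for the toy example. It is a minimal NumPy sketch, not PyMC3 code: it enumerates all (i, j) pairs, weights them by the priors and by the Normal likelihood of the observed total, and samples from the resulting table. The names prior, post, i_draws and j_draws are my own, and I am assuming the Categorical weights np.arange(10) are normalised to probabilities.

```python
import numpy as np

# Prior weights as in the model above, proportional to 0..9.
# Assumption: they are normalised to probabilities, so value 0 has prior 0.
weights = np.arange(10) + 0.0
prior = weights / weights.sum()

observed_total = 3.0
sigma = 1.0

values = np.arange(10)
# Grid of candidate sums i + j, shape (10, 10).
sums = values[:, None] + values[None, :]

# Unnormalised log posterior:
# log p(i) + log p(j) + log Normal(observed_total | i + j, sigma).
with np.errstate(divide="ignore"):  # log(0) gives -inf for impossible states
    log_prior = np.log(prior)
log_lik = (-0.5 * ((observed_total - sums) / sigma) ** 2
           - np.log(sigma * np.sqrt(2.0 * np.pi)))
log_post = log_prior[:, None] + log_prior[None, :] + log_lik

# Normalise into a joint posterior table over (i, j).
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Draw joint samples of (i, j) from the table.
rng = np.random.default_rng(0)
flat = rng.choice(post.size, size=4000, p=post.ravel())
i_draws, j_draws = np.unravel_index(flat, post.shape)
```

This only works because the toy state space is tiny (100 pairs). For the real model the enumeration would presumably have to be replaced by reweighting or resampling the posterior draws of the sub-components, which is the part I do not know how to express in PyMC3.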