Hi all,
first of all thanks a lot for the library. It’s an incredible tool and I found it very useful so far.
I’m not a big expert in the field though, so please forgive me if the question is trivial. I tried to browse both google, Discourse and the codebase without finding an answer.
I want to model some positive non-zero integers (counts > 0) as samples from a custom distribution depending on 3 parameters, and infer the parameters posterior distributions.
To do so I wrote the following:
with pymc.Model():
mu = pymc.Normal("mu", mu=-1, sigma=1)
r = pymc.Exponential("r", lam=1)
k = pymc.Gamma("k", alpha=np.log(samples.max()), beta=1)
pymc.DensityDist("obs", mu, r, k, observed=samples, logp=customlogp)
trace = pymc.sample(...)
where customlogp(x, mu, r, k)
is an appropriate logp function and samples
is the array of ints. Everything works fine and I get the estimate of my parameters. Happy.
Now, I make my model more complex, making the hypothesis that the counts I observe have been reduced by some process. I thought that a possible strategy could be to add another layer to the model, making my observed samples drawn by a BetaBinomial having the previous random variable as a maximum possible outcome (please if you think this does not make sense or there is a better way to do it, let me know :)).
I write the following:
with pymc.Model():
mu = pymc.Normal("mu", mu=-1, sigma=1)
r = pymc.Exponential("r", lam=1)
k = pymc.Gamma("k", alpha=np.log(samples.max()), beta=1)
counts = pymc.DensityDist("ab", mu, r, k, logp=gompertz)
alpha=pymc.Gamma("alpha",1000, 1)
beta=pymc.Gamma("beta",1000, 1)
pymc.BetaBinomial("obs", alpha=alpha, beta=beta, n=counts, observed=samples)
trace = pymc.sample(..)
which breaks because apparently “ab” gets float values, both positives and negatives.
SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'mu': array(-0.16700715), 'r_log__': array(0.6423041), 'k_log__': array(0.9295108), 'ab': array(-0.89165907), 'alpha_log__': array(6.0785945), 'beta_log__': array(6.55535832)}
which makes sense, I didn’t specify anywhere counts
should be giving positive integers.
I tried to fix this by specifying the random
kwarg for DensityDist
but apparently the sampler completely ignores it and still produces float values, positive and negative.
Which is the correct way to specify a discrete DensityDist with finite support?
Thanks a lot for your help