Hello, there!
I am trying to replicate, using PyMC3, the model built by Raftery in the article "Inference for the binomial N parameter: A hierarchical Bayes approach".
Problem definition
Let x_1, \ldots, x_n be a sample from a Binomial distribution with unknown N and \theta. That is, x_i \mid N, \theta \sim Bin(N, \theta) for i = 1, \ldots, n. Assume that N has a Poisson prior with mean \mu. Then, as discussed in the paper, x_1, \ldots, x_n are marginally Poisson distributed with mean \lambda = \mu\cdot\theta.
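This marginalization can be checked with a quick Monte Carlo sketch (plain NumPy; the values of \mu and \theta below are arbitrary choices for illustration, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, theta = 40.0, 0.5       # illustrative values
n_sim = 200_000

# Draw N ~ Poisson(mu), then x | N ~ Binomial(N, theta)
N = rng.poisson(mu, size=n_sim)
x = rng.binomial(N, theta)

print(x.mean())             # close to lambda = mu * theta = 20
print(x.var())              # for a Poisson, the variance equals the mean
```

Both the sample mean and the sample variance come out near \mu\theta, consistent with x being marginally Poisson(\lambda).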
The paper considers two priors: the improper prior \pi(\lambda, \theta) \propto \lambda^{-1}, and a proper one with \lambda \sim Gamma(\kappa_1, \kappa_2) and \theta \sim Unif(0,1) independent.
Remark: When we consider a single observation x and the improper prior, for instance, the median of the posterior distribution of N \mid x is 2x.
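This remark can be verified numerically. Under the improper prior, integrating \theta out gives p(N \mid x) \propto 1/(N(N+1)) for N \ge x, and the tail sum telescopes to P(N \ge m \mid x) = x/m, so the tail probability at m = 2x is exactly 1/2. A minimal sketch with a truncated support:

```python
# Posterior of N for a single observation x under the improper prior:
#   p(N | x) proportional to 1 / (N (N + 1)),  N >= x,
# whose tail sum telescopes to P(N >= m | x) = x / m.
x = 15
support = range(x, 2000 * x)               # truncate the infinite support
w = {N: 1.0 / (N * (N + 1)) for N in support}
total = sum(w.values())

tail = sum(wi for N, wi in w.items() if N >= 2 * x)
print(tail / total)                        # close to 0.5, i.e. median = 2x
```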
Application
For instance, suppose we observe \boldsymbol{x} = [15, 20, 21, 23, 26]. What inferences can we make about N?
What I tried to do
Consider the code:
```python
import numpy as np
import pymc3 as pm
import arviz as az

impala = np.array([15, 20, 21, 23, 26])

with pm.Model() as model:
    theta = pm.Beta("theta", 1, 1)          # Beta(1,1) is Unif(0,1)
    lambd = pm.Gamma("lambda", 1, 1)
    mu = lambd / theta                      # since lambda = mu * theta
    BoundPoisson = pm.Bound(pm.Poisson, lower=max(impala))
    N = BoundPoisson("N", mu=mu)
    x = pm.Binomial("x", n=N, p=theta, observed=impala)
    trace = pm.sample(8000, tune=8000)

display(az.summary(trace))
```
I computed \mu from the relation \lambda = \mu\cdot\theta, and I bounded N below since N \ge x_{max}.
Running this, I got the following messages:
Sampling 2 chains for 8_000 tune and 8_000 draw iterations (16_000 + 16_000 draws total) took 25 seconds.
The acceptance probability does not match the target. It is 0.94 but should be close to 0.8. Try to increase the number of tuning steps.
There were 251 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.66 but should be close to 0.8. Try to increase the number of tuning steps.
The estimated number of effective samples is smaller than 200 for some parameters.
(followed by the az.summary table with columns: mean, sd, hdi_3%, hdi_97%, mcse_mean, mcse_sd, ess_mean, ess_sd, ess_bulk, ess_tail)
I could not get it to work when the prior is improper, that is, \lambda \sim Gamma(0,0).
Remark: The number of divergences changes each time.
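One note on the improper prior: Gamma(0,0) is not a valid distribution, so samplers cannot use it directly. A common workaround (an approximation, not what the paper does exactly) is Gamma(\epsilon, \epsilon) with small \epsilon, whose density \lambda^{\epsilon-1}e^{-\epsilon\lambda} approaches the scale-invariant 1/\lambda as \epsilon \to 0. A quick check of the density ratio at two points, which for the 1/\lambda prior should be 2:

```python
import math

def gamma_logpdf(lam, a, b):
    """Log-density of Gamma(a, b) in the shape-rate parameterization."""
    return a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(lam) - b * lam

eps = 1e-6
# Density ratio at lambda = 1 vs lambda = 2; the 1/lambda prior gives exactly 2
ratio = math.exp(gamma_logpdf(1.0, eps, eps) - gamma_logpdf(2.0, eps, eps))
print(ratio)   # approaches 2 as eps -> 0
```

In the PyMC3 model above, this would mean replacing the Gamma(1, 1) prior with something like pm.Gamma("lambda", 0.001, 0.001) (the exact value of \epsilon is a judgment call, and very small values can themselves cause sampling problems).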