I renamed `BetaBinomial`

to `Gb1Binomial`

and adapted the `logp`

and `_random`

methods but my chains completely fail to mix , the draw rate is ~2 draws per second, and the estimates are completely wrong. I wonder if I have a mistranslation from Numpy to Theano, or if thereβs a bug in my PyMC class, or if my probability analysis is just wrong.

A set of three scripts are at https://gist.github.com/fasiha/5bf34e1882fa09ea9b60f32dfd38540f but Iβll summarize them here:

The `Gb1Binomial`

distribution is similar to `BetaBinomial`

, in that p \sim Beta(\alpha, \beta) except instead of k \sim Binomial(n, p), I have k \sim Binomial(n, p^\delta) for \delta > 0.

It can be generated in the following way, very similar to BetaBinomial:

```
def genGb1Binom(Ndata, ALPHA, BETA, DELTA, N):
n = np.ones(Ndata, dtype=int) * N
p = beta.rvs(ALPHA, BETA, size=Ndata)**DELTA
return binom.rvs(n, p)
```

Its likelihood is more complicated than `BetaBinomial`

; it needs a couple of helper functions: `binomln`

is included in PyMC but I need a sign-aware `logsumexp`

(like Scipyβs), so I include both here:

```
def binomln(n, k):
return -betaln(1 + n - k, 1 + k) - np.log(n + 1)
def logsumexp(x, signs):
"Adaptation of PyMC's logsumexp, but can take a list of signs like Scipy"
x_max = np.max(x)
result = np.sum(signs * np.exp(x - x_max))
return np.log(result if result >= 0 else -result) + x_max
def logp(k, alpha, beta, delta, n):
i = np.arange(n - k + 1.0)
mags = binomln(n - k, i) + betaln(delta * (i + k) + alpha, beta)
signs = (-1.0)**i
return binomln(n, k) - betaln(alpha, beta) + logsumexp(mags, signs)
```

Iβve validated this scheme of drawing random variables and this `logp`

in `validate_gb1_logp.py`

in the above Gist.

Then I ported it to a PyMC class (`gb1binomial.py`

in the Gist).

The `_random`

method is almost exactly the same as `BetaBinomial`

's, except I have `_p = _p**delta`

, which is the sole difference between that and `Gb1Binomial`

.

Then I had to adapt the `logp`

to Theano tensors, and this is what I came up with: first a Theano port of the `logsumexp`

helper and then `logp`

:

```
def logsumexp(x, signs, axis=None):
"Adaptation of PyMC's logsumexp, but can take a list of signs like Scipy"
x_max = tt.max(x, axis=axis, keepdims=True)
result = tt.sum(signs * tt.exp(x - x_max), axis=axis, keepdims=True)
return tt.switch(result >= 0,
tt.log(result) + x_max,
tt.log(-result) + x_max)
def logp(self, value):
def mapper(n, k, alpha, beta, delta):
i = tt.arange(n - k + 1.0)
mags = binomln(n - k, i) + betaln(delta * (i + k) + alpha, beta)
signs = (-tt.ones_like(i))**i # (-1.0)**i
return binomln(n, k) - betaln(alpha, beta) + logsumexp(mags, signs)
return bound(
theano.map(mapper,
sequences=[self.n, value],
non_sequences=[self.alpha, self.beta,
self.delta])[0], value >= 0,
value <= self.n, self.alpha > 0, self.beta > 0, self.delta > 0)
```

I create a simple PyMC model in `demo_gb1_pymc.py`

in the above Gist. It compiles and runs, albeit very very slowly, and produces nonsense results:

```
Running on PyMC3 v3.8
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [d, b, a]
Sampling 2 chains, 707 divergences: 100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 6000/6000 [2:19:32<00:00, 1.40s/draws]
There were 126 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.5919737808282699, but should be close to 0.8. Try to increase the number of tuning steps.
There were 581 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.31781568556128037, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The estimated number of effective samples is smaller than 200 for some parameters.
a b d
count 4000.000000 4000.000000 4000.000000
mean 9.818727 4.177204 0.818759
std 0.125664 0.016436 0.010090
min 9.575419 4.124418 0.798366
25% 9.695400 4.166789 0.807851
50% 9.835942 4.180275 0.821065
75% 9.936099 4.189759 0.827032
max 9.999970 4.226625 0.839932
```

I say βnonsenseβ because the samples for the three variables seem to have barely mixed and arenβt what I expect them to be (in the simulation, `alpha`

is 3.3, `beta`

is 4.4, and `delta`

is 0.333).

I realize this is a ton of code and math to take in, but I hope the parallels between `Gb1Binomial`

and what you already have in `BetaBinomial`

are straightforward and some kind soul can point out what Iβm doing wrong? Thank you!!!

(N.B. All code samples are on Gist: https://gist.github.com/fasiha/5bf34e1882fa09ea9b60f32dfd38540f)