I have successfully (I believe!) created a CustomDist:
```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt
from pymc.distributions.dist_math import binomln, logpow
from pymc.pytensorf import floatX, intX

def binom_logp(value, n, p):
    value = pt.as_tensor_variable(intX(value))  # do I need these?
    n = pt.as_tensor_variable(intX(n))          # do I need these?
    p = pt.as_tensor_variable(floatX(p))        # do I need these?
    res = pt.switch(
        pt.or_(pt.lt(value, 0), pt.gt(value, n)),
        -np.inf,
        binomln(n, value) + logpow(p, value) + logpow(1 - p, n - value),
    )
    return res

def binom_draw(n, p, rng=None, size=None):
    # PyMC passes a Generator via the rng keyword, so draw from it
    # rather than calling the unbound np.random.Generator.binomial
    return rng.binomial(n, p, size=size)

n = pm.CustomDist('n', N, p, size=len(obs_n), logp=binom_logp, random=binom_draw, observed=obs_n)
```
It is a simple binomial distribution. I know this is already built into PyMC; I just want to check that everything works on this simple example before I modify the CustomDist into something more interesting.
Unfortunately, this model is being sampled with n treated as a continuous random variable (for instance, NUTS is being used to impute the missing/masked values in obs_n rather than Metropolis).
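A quick way to see how PyMC is classifying the variable (a small diagnostic, assuming the model above) is to print the dtype it has inferred, since, as far as I understand, step assignment keys off the variable's dtype:

```python
# CustomDist defaults to a float dtype (floatX) when no dtype is given,
# so this presumably prints something like 'float64' rather than 'int64',
# which is why a gradient-based sampler gets assigned.
print(n.dtype)
```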
Q: How can I ensure this CustomDist is properly treated as a discrete distribution?
A quick follow-up question: what is the best way to require that the values in a CustomDist are bounded? I don't want the sampler to waste time searching zero-likelihood regions, so I'd like to impose a lower and an upper bound on
```python
n = pm.CustomDist('n', N, p, size=len(obs_n), logp=binom_logp, random=binom_draw, observed=obs_n)
```
so that the imputed missing values are sampled efficiently. Many of the built-in distributions have `lower` and `upper` keyword arguments; does CustomDist have the same?
If your variable is observed, there is no sampling going on, so you don't have to worry about it.
Otherwise, you would need to implement your own step sampler that never proposes dumb values. You could probably subclass whatever step sampler is being used and tweak the proposed values (just make sure to keep detailed balance).
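To make the detailed-balance point concrete, here is a minimal, library-agnostic sketch in plain NumPy (this is not PyMC's actual step-sampler API; wiring it into PyMC would mean subclassing the Metropolis step as described above, and the bounds, window width, and helper names here are just illustrative). The proposal only ever draws values inside [lower, upper], and the Hastings correction accounts for the clipped windows near the edges:

```python
import numpy as np

rng = np.random.default_rng(0)


def window(x, lower, upper, width=3):
    """In-bounds integer candidates around x (x itself excluded)."""
    lo, hi = max(lower, x - width), min(upper, x + width)
    return [v for v in range(lo, hi + 1) if v != x]


def bounded_mh_step(x, logp, lower, upper, width=3):
    """One Metropolis-Hastings update that never proposes out-of-bounds values.

    Clipping the window at the bounds makes the proposal asymmetric near the
    edges, so the Hastings ratio q(x | y) / q(y | x) is included to keep
    detailed balance (each q is uniform over its window).
    """
    forward = window(x, lower, upper, width)
    y = forward[rng.integers(len(forward))]
    backward = window(y, lower, upper, width)
    log_accept = (logp(y) - logp(x)) + np.log(len(forward)) - np.log(len(backward))
    return y if np.log(rng.uniform()) < log_accept else x
```

Run repeatedly, this keeps every proposal inside the bounds while still targeting the correct distribution over the integers in [lower, upper].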