Making a discrete CustomDist

Hi, I’m using pymc5.5.0.

I have successfully (I believe!) created a CustomDist:

def binom_logp( value, n, p):
        value = pt.as_tensor_variable(intX(value)) ## do I need these?
        n = pt.as_tensor_variable(intX(n), dtype='int32') ## do I need these?
        p = pt.as_tensor_variable(floatX(p)) ## do I need these?
        res = pt.switch(
                pt.or_(pt.lt(value, 0), pt.gt(value, n)),
                -np.inf,
                binomln(n, value) + logpow(p, value) + logpow(1 - p, n - value),
                )
        print(res)
        return res

def binom_draw(n, p, size=None):
        return np.random.Generator.binomial( n, p, size = size )

n = pm.CustomDist('n', N, p, size = len(obs_n), logp = binom_logp, random = binom_draw, observed = obs_n)

It is a simple binomial distribution. I know this is already in-built - I just want to check everything is working on this simple example before I modify CustomDist to something more interesting.

Unfortunately this model is being sampled by treating n as a continuous random variable (so, for instance, NUTS is being used to impute the missing/masked data in obs_n, rather than Metropolis).

Q: How can I ensure this CustomDist is properly treated as a discrete distribution?

Thanks!

Can we see the full model (the one you are sampling)?

You can pass dtype="int64" to your CustomDist to make it discrete

Thanks - @ricardoV94’s solution worked a treat.

A quick follow-up question is: what is the best way to demand that the values in a CustomDist are bounded? I don’t want the sampler to waste time searching in zero-likelihood regions, so I’d like to impose a lower and upper bound on:

n = pm.CustomDist(‘n’, N, p, size = len(obs_n), logp = binom_logp, random = binom_draw, observed = obs_n)

so that the imputed missing values are sampled efficiently. Many of the inbuilt distributions have a ‘lower’ and ‘upper’ keyword argument; does CustomDist have the same?

Thanks!

If your variable is observed there is no sampling going on so you don’t have to worry about it.

Otherwise you would need to implement your own step sampler that never proposes dumb values. You could probably subclass whatever step sampler is being used and tweak the proposed values (just make sure to keep detailed balance).

Otherwise there was a draft sometime ago to allow discrete variable transforms, so the samplers don’t need to know about these things: Implement transforms for discrete variables by ricardoV94 · Pull Request #6102 · pymc-devs/pymc · GitHub

If you think that functionality would be useful for you, feel free to chime in on the draft

1 Like