I have successfully (I believe!) created a CustomDist:
```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt
from pymc.distributions.dist_math import binomln, logpow
from pymc.pytensorf import floatX, intX

def binom_logp(value, n, p):
    value = pt.as_tensor_variable(intX(value))  # do I need these?
    n = pt.as_tensor_variable(intX(n))          # do I need these?
    p = pt.as_tensor_variable(floatX(p))        # do I need these?
    res = pt.switch(
        pt.or_(pt.lt(value, 0), pt.gt(value, n)),
        -np.inf,
        binomln(n, value) + logpow(p, value) + logpow(1 - p, n - value),
    )
    return res

def binom_draw(n, p, rng=None, size=None):
    # PyMC passes a Generator via the rng keyword, so draw from it
    # rather than calling the unbound np.random.Generator.binomial
    return rng.binomial(n, p, size=size)

n = pm.CustomDist('n', N, p, size=len(obs_n), logp=binom_logp, random=binom_draw, observed=obs_n)
```
It is a simple binomial distribution. I know this is already built into PyMC; I just want to check that everything works on this simple example before I modify the CustomDist into something more interesting.
Unfortunately, this model is being sampled with n treated as a continuous random variable (for instance, NUTS is being used to impute the missing/masked values in obs_n rather than Metropolis).
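A quick way to see how PyMC is classifying the variable (a small diagnostic, assuming the model above) is to print the dtype it has inferred, since, as far as I understand, step assignment keys off the variable's dtype:

```python
# CustomDist defaults to a float dtype (floatX) when no dtype is given,
# so this presumably prints something like 'float64' rather than 'int64',
# which is why a gradient-based sampler gets assigned.
print(n.dtype)
```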
Q: How can I ensure this CustomDist is properly treated as a discrete distribution?
A quick follow-up question: what is the best way to require that the values in a CustomDist are bounded? I don't want the sampler to waste time searching zero-likelihood regions, so I'd like to impose a lower and an upper bound on
```python
n = pm.CustomDist('n', N, p, size=len(obs_n), logp=binom_logp, random=binom_draw, observed=obs_n)
```
so that the imputed missing values are sampled efficiently. Many of the built-in distributions have `lower` and `upper` keyword arguments; does CustomDist have the same?
If your variable is observed, there is no sampling going on, so you don't have to worry about it.
Otherwise, you would need to implement your own step sampler that never proposes dumb values. You could probably subclass whatever step sampler is being used and tweak the proposed values (just make sure to keep detailed balance).
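To make the detailed-balance point concrete, here is a minimal, library-agnostic sketch in plain NumPy (this is not PyMC's actual step-sampler API; wiring it into PyMC would mean subclassing the Metropolis step as described above, and the bounds, window width, and helper names here are just illustrative). The proposal only ever draws values inside [lower, upper], and the Hastings correction accounts for the clipped windows near the edges:

```python
import numpy as np

rng = np.random.default_rng(0)


def window(x, lower, upper, width=3):
    """In-bounds integer candidates around x (x itself excluded)."""
    lo, hi = max(lower, x - width), min(upper, x + width)
    return [v for v in range(lo, hi + 1) if v != x]


def bounded_mh_step(x, logp, lower, upper, width=3):
    """One Metropolis-Hastings update that never proposes out-of-bounds values.

    Clipping the window at the bounds makes the proposal asymmetric near the
    edges, so the Hastings ratio q(x | y) / q(y | x) is included to keep
    detailed balance (each q is uniform over its window).
    """
    forward = window(x, lower, upper, width)
    y = forward[rng.integers(len(forward))]
    backward = window(y, lower, upper, width)
    log_accept = (logp(y) - logp(x)) + np.log(len(forward)) - np.log(len(backward))
    return y if np.log(rng.uniform()) < log_accept else x
```

Run repeatedly, this keeps every proposal inside the bounds while still targeting the correct distribution over the integers in [lower, upper].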