How to limit a Geometric Discrete Distribution in a range

EduardoCabria · February 20, 2020, 10:23pm

Hi mates!
I’m a PyMC3 beginner. Over the last few weeks I have been reading your posts to answer my own questions, but I’ve run into a problem with this distribution that I’d like to share. I would be grateful for any help.

The issue arises when I plot a plain Geometric discrete distribution:

(In a .py program)

   with pm.Model():
       nombres_clientes = pm.Geometric('clients_names', p=0.02685)
       trace = pm.sample(10000, cores=1)
   pm.plot_posterior(trace)

My result is:

[Image: plot_clients_name]

What I want is to limit the distribution to the interval [1, 100]; I don’t want any values above 100. I only want to draw values in that interval so that each number maps to a real name, and my database has 100 entries.

It may be a silly problem, but I’ve spent two hours trying to solve it. I thought about using the logp function, but I don’t really know how to write it.

Thank you, community.

nkaimcaudle · February 21, 2020, 1:24am

You can use pm.Bound() to achieve this

with pm.Model() as model:
    names1 = pm.Geometric('names1', p=0.02685)
    names2 = pm.Bound(pm.Geometric, upper=100)('names2', p=0.02685)
    trace = pm.sample(10000, cores=1)
pm.plot_posterior(trace);
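
For reference, here is a minimal sketch of what restricting the Geometric to [1, 100] means, written with the standard library only rather than PyMC3. It assumes the usual pmf with support k = 1, 2, ... (p * (1 - p) ** (k - 1)) and renormalizes it over the interval; the helper name `truncated_geometric_pmf` is made up for illustration.

```python
import random

def truncated_geometric_pmf(p, lower=1, upper=100):
    """Return {k: probability} for a geometric pmf restricted to [lower, upper]."""
    # Untruncated pmf with support k = 1, 2, ...: p * (1 - p) ** (k - 1)
    weights = {k: p * (1 - p) ** (k - 1) for k in range(lower, upper + 1)}
    total = sum(weights.values())  # probability mass that falls inside the interval
    return {k: w / total for k, w in weights.items()}

pmf = truncated_geometric_pmf(0.02685)

# Draw client indices directly from the truncated pmf (seeded for repeatability).
rng = random.Random(0)
draws = rng.choices(list(pmf), weights=list(pmf.values()), k=10_000)
```

Every value in `draws` lies in 1..100, so each one can be mapped to a row of a 100-entry table. If p is fixed, as in the question, drawing this way may be all that is needed.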

junpenglao · February 21, 2020, 6:20am

Or a potential:

with pm.Model(): 
    g = pm.Geometric('g', p=.01) 
    pm.Potential('constrain', tt.switch((g<1) | (g>100), -np.inf, 0.)) 
    trace = pm.sample()
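
As a rough illustration of what the potential does: it adds an extra term to the model's log-probability, here 0 inside the range and -inf outside. The sketch below mimics that in plain Python; the function names are hypothetical and are not PyMC3 API.

```python
import math

def geometric_logp(k, p):
    """Log pmf of a geometric distribution with support k = 1, 2, ..."""
    if k < 1:
        return -math.inf
    return math.log(p) + (k - 1) * math.log1p(-p)

def constraint(k, lower=1, upper=100):
    """Mirror of the tt.switch term: -inf outside [lower, upper], 0 inside."""
    return -math.inf if (k < lower or k > upper) else 0.0

def constrained_logp(k, p):
    # The potential is simply added to the log-probability of the variable.
    return geometric_logp(k, p) + constraint(k)

inside = constrained_logp(50, 0.01)    # unchanged from geometric_logp(50, 0.01)
outside = constrained_logp(101, 0.01)  # -inf, so a sampler can never accept it
```

Note that the term only rules values out; it does not renormalize the distribution.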

EduardoCabria · February 21, 2020, 1:25pm

Thank you both! That solved my problem.

EduardoCabria · March 2, 2020, 7:31pm

@junpenglao @nkaimcaudle Sorry for reopening the thread, but if I want to limit a continuous distribution, again between 0 and 101, what should I use?

EduardoCabria · March 2, 2020, 7:35pm

I tried this:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as nombre_Agapito:
    edad = pm.Weibull("edad", alpha=2.2309, beta=0.02945)
    dist = pm.Potential('dist', tt.switch((edad<1) | (edad>100),-np.inf, 0.))
    trace1 = pm.sample(10000, cores=1)
pm.plot_trace(trace1)
pm.plot_posterior(trace1)

and I get the error:

SamplingError: Bad initial energy

nkaimcaudle · March 3, 2020, 12:16am

In theory the same two techniques will still work. Here, however, I think your starting parameters cause the first sample from the Weibull to fall outside the 1-100 range. If I change to alpha=5. and beta=2. then it does (sometimes) work.

How are you getting your initial parameter values? They are very specific and not well suited to covering the range 1-100.

np.mean( pm.Weibull.dist(alpha=2.2309, beta=0.02945).random(size=10_000)>1 ) # equals 0.00

If I draw 10,000 random samples from the Weibull with those parameters then none of them are above 1.
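
The same point can be checked in closed form without sampling. This sketch assumes PyMC3's Weibull parameterization, with alpha as the shape and beta as the scale, so that the mean is beta * Gamma(1 + 1/alpha) and P(X > x) = exp(-(x / beta) ** alpha).

```python
import math

alpha, beta = 2.2309, 0.02945  # shape and scale (assumed PyMC3 convention)

# Mean of the Weibull: beta * Gamma(1 + 1/alpha)
mean = beta * math.gamma(1 + 1 / alpha)

def survival(x):
    """P(X > x) for the Weibull above."""
    return math.exp(-((x / beta) ** alpha))

p_above_1 = survival(1.0)                     # mass above the lower bound
p_in_range = survival(1.0) - survival(100.0)  # mass inside [1, 100]
```

With these parameters the mean is a small fraction of 1 and the mass inside [1, 100] is zero to floating-point precision. The potential is therefore -inf almost everywhere the Weibull has support, which is consistent with the "Bad initial energy" error.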

AlexAndorra · March 10, 2020, 8:29pm

Hi there!
Just jumping in because I tried implementing @junpenglao’s solution with a Potential:

bike_count = pm.Geometric('bike_count', p, observed=bike_data["count"])
pm.Potential('constraint', tt.switch(bike_count > 1100, -np.inf, 0.))

This samples perfectly, but looking at the PPCs, I’m not sure the constraint was applied:

idata = az.from_pymc3(trace=trace_bike_3, prior=prior_samples, posterior_predictive=post_samples)
az.plot_ppc(idata);

This gives:

PPCs seem to still go way out of range compared to observations, don’t they?

AlexAndorra · March 30, 2020, 9:20am

Just reviving this post: do you think it’s a bug, or did I make a mistake somewhere?

junpenglao · March 30, 2020, 12:41pm

Oh yeah, seems like a bug: posterior predictive sampling cannot account for the potential.

junpenglao · March 30, 2020, 12:42pm

We should add a warning to posterior predictive sampling whenever we detect that a model contains a potential; in most cases the result will not be correct.
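
To make the issue concrete, here is a small standard-library sketch, not PyMC3 code. Forward sampling draws from the Geometric itself and never consults an extra log-probability term, so draws can exceed the bound. The value of `p` is a hypothetical stand-in for a posterior draw; the bound of 1100 is taken from the snippet above. Discarding out-of-range draws afterwards is one possible workaround, equivalent to sampling from a truncated Geometric.

```python
import math
import random

def geometric_draw(rng, p):
    """Inverse-CDF draw from a geometric distribution with support k = 1, 2, ..."""
    u = rng.random()  # uniform on [0, 1)
    return math.floor(math.log1p(-u) / math.log1p(-p)) + 1

rng = random.Random(0)
p = 0.002     # hypothetical value, standing in for a posterior draw of p
upper = 1100  # bound used in the Potential above

# Forward (predictive) sampling: the constraint is never evaluated here.
forward_draws = [geometric_draw(rng, p) for _ in range(10_000)]

# Possible workaround: reject draws that violate the constraint.
kept = [k for k in forward_draws if k <= upper]
```

Here `forward_draws` contains values well above 1100, while `kept` respects the bound.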

AlexAndorra · March 30, 2020, 12:53pm

Thanks Junpeng! I’ll open an issue on GitHub then
