Hi mates!
I'm a PyMC3 beginner. Over the last few weeks I have been reading your posts to resolve my doubts, but I ran into trouble with this distribution and wanted to share it with you. If you can help me, I would be grateful.
The issue comes up when I plot an ordinary Geometric discrete distribution:
(In a .py program)
import pymc3 as pm

with pm.Model():
    nombres_clientes = pm.Geometric('clients_names', p=0.02685)
    trace = pm.sample(10000, cores=1)
    pm.plot_posterior(trace)
My result is:
What I want is to limit the distribution to the interval [1, 100]; I don't want any values above 100. That is because I only want to pick values in that interval so that each number maps to a real name, and my database has 100 entries.
It may be a silly problem, but I've spent two hours trying to solve it. I thought about using the logp function, but I don't really know how to write it.
Thank you, community.
You can use pm.Bound() to achieve this:
with pm.Model() as model:
    names1 = pm.Geometric('names1', p=0.02685)
    names2 = pm.Bound(pm.Geometric, upper=100)('names2', p=0.02685)
    trace = pm.sample(10000, cores=1)
    pm.plot_posterior(trace);
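If the goal is only to pick an index between 1 and 100 with geometric weights, the draws can also be generated without MCMC. The following is a minimal sketch using only the Python standard library. It assumes the intended target is the geometric distribution conditioned on 1 <= k <= 100, and the helper name `sample_truncated_geometric` is hypothetical, not a PyMC3 function. It inverts the truncated CDF F(k) = (1 - (1-p)^k) / (1 - (1-p)^100).

```python
import math
import random


def sample_truncated_geometric(p, upper, rng):
    """Draw one value from Geometric(p) conditioned on 1 <= k <= upper.

    Hypothetical helper (not part of PyMC3) using inverse-CDF sampling.
    """
    q = 1.0 - p
    mass = 1.0 - q ** upper            # P(X <= upper) for the untruncated geometric
    u = rng.random()                   # uniform in [0, 1)
    k = math.ceil(math.log(1.0 - u * mass) / math.log(q))
    # Clamp to guard against u == 0 and floating-point rounding at the edges.
    return min(max(k, 1), upper)


rng = random.Random(0)
draws = [sample_truncated_geometric(0.02685, 100, rng) for _ in range(10_000)]
```

Every element of `draws` is an integer in [1, 100], so it can be used directly as a lookup key into a 100-row table of names.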
Or a potential:
import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model():
    g = pm.Geometric('g', p=.01)
    pm.Potential('constrain', tt.switch((g<1) | (g>100), -np.inf, 0.))
    trace = pm.sample()
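To see what the Potential does, it can help to write the resulting log-probability by hand. A Potential adds an extra term to the model's log-probability; here that term is 0 inside [1, 100] and -inf outside, so out-of-range values end up with zero probability. The plain-Python sketch below mirrors that logic for illustration only. The name `constrained_logp` is hypothetical, and the result is unnormalized.

```python
import math


def geometric_logp(k, p):
    """Log-pmf of a geometric distribution on k = 1, 2, 3, ..."""
    if k < 1:
        return -math.inf
    return math.log(p) + (k - 1) * math.log(1.0 - p)


def constrained_logp(k, p, lower=1, upper=100):
    """Geometric log-pmf plus a 0 / -inf penalty term (unnormalized)."""
    penalty = -math.inf if (k < lower or k > upper) else 0.0
    return geometric_logp(k, p) + penalty


inside = constrained_logp(50, 0.01)    # finite: same as the plain geometric log-pmf
outside = constrained_logp(101, 0.01)  # -inf: a value above 100 has zero probability
```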
Thank you both! That solved my problem.
@junpenglao @nkaimcaudle Sorry for reopening this thread, but if I want to limit a continuous distribution, again between 0 and 101, what should I use?
This is what I tried:
import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as nombre_Agapito:
    edad = pm.Weibull("edad", alpha=2.2309, beta=0.02945)
    dist = pm.Potential('dist', tt.switch((edad<1) | (edad>100), -np.inf, 0.))
    trace1 = pm.sample(10000, cores=1)
    pm.plot_trace(trace1)
    pm.plot_posterior(trace1)
and I get the error:
SamplingError: Bad initial energy
In theory the same two techniques will still work. Here, however, I think your parameters are causing the initial value drawn from the Weibull to fall outside the 1-100 range. If I change to alpha=5. and beta=2. then it does (sometimes) work.
How did you obtain your parameter values? They are very specific and not well suited to covering the range 1-100.
np.mean( pm.Weibull.dist(alpha=2.2309, beta=0.02945).random(size=10_000)>1 ) # equals 0.00
If I draw 10,000 random samples from the Weibull with those parameters then none of them are above 1.
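The same conclusion follows from the Weibull survival function, P(X > x) = exp(-(x / beta)^alpha), with alpha as the shape and beta as the scale, which matches PyMC3's parameterization. The sketch below uses only the standard library; `weibull_sf` and `weibull_mean` are hypothetical helper names introduced for illustration.

```python
import math


def weibull_sf(x, alpha, beta):
    """P(X > x) for a Weibull with shape alpha and scale beta."""
    return math.exp(-((x / beta) ** alpha))


def weibull_mean(alpha, beta):
    """Mean of the same Weibull: beta * Gamma(1 + 1/alpha)."""
    return beta * math.gamma(1.0 + 1.0 / alpha)


# Parameters from the question: essentially all the mass sits far below 1.
p_above_1 = weibull_sf(1.0, alpha=2.2309, beta=0.02945)
mean_original = weibull_mean(alpha=2.2309, beta=0.02945)

# Parameters suggested in the reply: most of the mass is above 1.
p_above_1_alt = weibull_sf(1.0, alpha=5.0, beta=2.0)
```

With the original parameters the mean is roughly 0.026 and the probability of exceeding 1 underflows to zero in floating point, so a constraint of [1, 100] leaves the sampler almost no admissible region to start from.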
Hi there!
Just jumping in because I tried implementing @junpenglao’s solution with a Potential:
bike_count = pm.Geometric('bike_count', p, observed=bike_data["count"])
pm.Potential('constraint', tt.switch(bike_count > 1100, -np.inf, 0.))
This samples perfectly, but I’m not sure the constraint was applied, when looking at PPCs:
idata = az.from_pymc3(trace=trace_bike_3, prior=prior_samples, posterior_predictive=post_samples)
az.plot_ppc(idata);
This gives:
PPCs seem to still go way out of range compared to observations, don’t they?
Just reviving this post: do you think it's a bug, or did I make a mistake somewhere?
Oh yeah, this seems like a bug: posterior predictive sampling cannot account for the potential.
We should add a warning to posterior_predictive whenever we detect that a model contains a potential; in most cases the result will not be correct.
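Two things seem to be going on here. First, because `bike_count` is observed, the `tt.switch` term is evaluated on fixed data, so it is a constant that does not depend on the parameters and should not change the posterior. Second, posterior predictive draws come from forward sampling of the likelihood, which does not consult potentials. One possible workaround, sketched below in plain Python and not a PyMC3 feature, is to discard predictive draws that violate the constraint. The value of `p`, the helper `draw_geometric`, and the sample size are made up for illustration.

```python
import math
import random


def draw_geometric(p, rng):
    """One forward draw from Geometric(p) on 1, 2, 3, ... (hypothetical helper)."""
    u = 1.0 - rng.random()             # uniform in (0, 1], avoids log(0)
    return max(1, math.ceil(math.log(u) / math.log(1.0 - p)))


rng = random.Random(42)
p = 0.005                              # made-up value, not estimated from the bike data
upper = 1100

# Plain forward sampling ignores the constraint, like the PPC in the plot.
raw = [draw_geometric(p, rng) for _ in range(10_000)]

# Rejection step: keep only draws that satisfy the constraint.
kept = [k for k in raw if k <= upper]
```

If the counts really cannot exceed 1100, a likelihood that encodes the bound directly is probably a cleaner choice than filtering after the fact.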
Thanks Junpeng! I’ll open an issue on GitHub then