I’m trying to use PyMC3 to work out an example (exercise 3.3, p 47) from David MacKay’s great book "Information Theory, Inference, and Learning Algorithms".
I was able to use a crude (manual) grid search approach and it worked OK. I was hoping to learn more about PyMC3 by using it for the same purpose. Here’s what I tried:
import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt
print(f"Running on PyMC3 v{pm.__version__}")
np.random.seed(451)
x = np.random.exponential(3,size=500)
minx=1
maxx=20
obs = x[np.where(~((x<minx) | (x>maxx)))] # remove values outside range
with pm.Model() as expModel:
    λ = pm.Exponential("λ", lam=5)
    x = pm.Exponential('x', lam=1/λ, observed=obs)
    clip_exp = pm.Potential("clip_exp", tt.switch((x < minx) | (x > maxx), -np.inf, 0))
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)

az.summary(trace)
Sadly it doesn’t appear to have worked:
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_mean ess_sd
λ 3.683 0.188 3.333 4.051 0.003 0.002 3268.0 3268.0
Any suggestions would be appreciated!
thank you,
-Steve
I think you want to use a bound on lambda (or 1/lambda)? x is not a random variable since you observed it, so the pm.Potential is not doing anything (x will always equal obs, and the clip_exp will always return 0).
Ah, OK. Thank you for that insight. The idea is that I’m observing x, which is exponentially distributed, but I’m only able to observe values of x between minx and maxx. Values outside this range are not available to me. In my model \lambda has the same units as x, so I need to specify lam (which has units of 1/x) as the reciprocal of \lambda. I looked into pm.Bound, but my understanding is that it cannot be used for observed values. Is there a simple way to describe observed values that are censored in this way?
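To spell out what I think the likelihood should be: each observation ought to contribute the exponential density renormalized over the observable window,

p(x \mid \lambda) = \frac{(1/\lambda)\, e^{-x/\lambda}}{e^{-minx/\lambda} - e^{-maxx/\lambda}}, \quad minx \le x \le maxx,

since the denominator is the probability that a draw lands inside [minx, maxx]. But I’m not sure how to express that renormalization in PyMC3.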
thanks,
-Steve
According to Wikipedia it seems you need to add a correction term equal to cdf(max) - cdf(min). My guess would be a potential like this might do the job:
λ = pm.Exponential("λ", lam=5)
x = pm.Exponential('x', lam=1/λ, observed=obs)
exp_dist = pm.Exponential.dist(lam=1/λ)  # this is not part of the model, just used to get the logcdf
clip_exp = pm.Potential("clip_exp", tt.log(tt.exp(exp_dist.logcdf(maxx)) - tt.exp(exp_dist.logcdf(minx))))
Ah, I see. The correction term is independent of x, but not of \lambda. This didn’t seem to work, but I’ll poke around a bit more. Thanks for the help.
I wrongly used the argument mu in exp_dist instead of lam in my code snippet above (changed it now). Maybe that’s why it was not working?
The other thing is whether the value in the Potential should be negated (I never remember this).
This Stan page may be helpful for understanding the general approach: 7.4 Sampling Statements | Stan Reference Manual
Thanks. Here’s what I’ve got right now:
import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt
print(f"Running on PyMC3 v{pm.__version__}")
np.random.seed(451)
x = np.random.exponential(3,size=500)
minx=1
maxx=20
obs = x[np.where(~((x<minx) | (x>maxx)))] # remove values outside range
with pm.Model() as expModel:
    λ = pm.Exponential("λ", lam=1/5)  # prior exponential with mean of 5
    x = pm.Exponential('x', lam=1/λ, observed=obs)  # obs exponential with mean of $\lambda$
    exp_dist = pm.Exponential.dist(lam=1/λ)  # not part of the model, just used to get the logcdf
    clip_exp = pm.Potential("clip_exp", -pm.math.logdiffexp(exp_dist.logcdf(maxx), exp_dist.logcdf(minx)))
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)

az.summary(trace)
Still getting:
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_mean ess_sd
λ 3.874 0.206 3.484 4.257 0.004 0.003 3279.0 3279.0
I’m expecting \lambda to be around 3.
The sign on the -pm.math... term doesn’t seem to matter (+ or -, makes no difference). Not sure how to approach debugging this. Any ideas?
Maybe the Potential needs to be multiplied by the number of observations (I am running out of ideas here)?
Edit: There is one example with censored data in this Notebook (resources/DataAnalysis.ipynb at master · pymc-devs/resources · GitHub), section 5.5. It strengthens my conviction that the correction term has to be proportional to the number of observations:
clip_exp = pm.Potential("clip_exp", -pm.math.logdiffexp(exp_dist.logcdf(maxx), exp_dist.logcdf(minx)) * x.size)
Edit: It seems to be working in this Colab NoteBook (?)
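As an extra sanity check outside of PyMC3, you could maximize the truncated log-likelihood directly and see if it lands in the same place. A rough sketch (obs, minx, maxx are from your script; the helper name is mine):

import numpy as np
from scipy.optimize import minimize_scalar

def neg_trunc_loglik(lam, data, lo, hi):
    # negative log-likelihood of an exponential with mean lam, truncated to [lo, hi]:
    # -(sum_i log f(x_i) - n * log(F(hi) - F(lo)))
    n = len(data)
    loglik = -n * np.log(lam) - data.sum() / lam
    loglik -= n * np.log(np.exp(-lo / lam) - np.exp(-hi / lam))
    return -loglik

res = minimize_scalar(neg_trunc_loglik, bounds=(0.1, 20.0),
                      args=(obs, minx, maxx), method="bounded")
print(res.x)  # should land near the posterior mean for λ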
Oh, wow. I never would’ve guessed that! Thank you. I’ll give that a try.
I think that worked! Here’s what I have now:
import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt
print(f"Running on PyMC3 v{pm.__version__}")
np.random.seed(451)
x = np.random.exponential(3,size=500)
minx=1
maxx=20
obs = x[np.where(~((x<minx) | (x>maxx)))] # remove values outside range
def censor_correct(λ):
    # -N * log(F(maxx) - F(minx)) for an exponential with mean λ
    result = -tt.log(tt.exp(-minx/λ) - tt.exp(-maxx/λ)) * len(obs)
    return result

with pm.Model() as expModel:
    λ = pm.Exponential("λ", lam=1/5)  # prior exponential with mean of 5
    clip_exp = pm.Potential("clip_exp", censor_correct(λ))
    x = pm.Exponential('x', lam=1/λ, observed=obs)  # obs exponential with mean of $\lambda$
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)

az.summary(trace)
mean sd hdi_3% hdi_97%
λ 2.895 0.159 2.603 3.199
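For the record, I think the same correction can also be written with the logcdf approach from earlier in the thread, scaled by the number of observations (I haven’t rerun this variant, so treat it as a sketch):

exp_dist = pm.Exponential.dist(lam=1/λ)  # only used to get the logcdf
clip_exp = pm.Potential("clip_exp",
                        -pm.math.logdiffexp(exp_dist.logcdf(maxx), exp_dist.logcdf(minx)) * len(obs))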
Thank you!