Trouble Using SMC sampler for Marginal Likelihood/Bayes Factor Computation

pwitkow · May 22, 2018, 6:50am

Hi,

I’m testing the Sequential Monte Carlo sampler given the example in the docs here

I tried doing this with a simple linear regression model like this:

#generate data
np.random.seed(99)
x=np.random.choice(np.arange(1, 15, 1),20 , replace=True)
x=np.array(sorted(x))
y=(3*x)+np.random.normal(loc=0.0, scale=2.0, size=len(x))

test_folder=mkdtemp(‘Test’)
nchains=1000

true model for the data

true_model=pm.Model()
with true_model:

true_int=pm.Normal('true_int', mu=0, sd=10)
sig_true_int=pm.HalfNormal('sig_true_int', sd=2)

true_b1=pm.Normal('true_b1', mu=0, sd=10)
sig_b_grp=pm.HalfNormal('sig_true_b1', sd=2)

#specify the simple linear model
true_mu=true_int+(true_b1*x)

true_eps = pm.HalfCauchy('eps', 5)

Y_obs = pm.Normal('Y_obs', mu=true_mu, sd=true_eps, observed=y)
true_trace = smc.sample_smc(samples=2000,
                           n_chains=nchains,
                           progressbar=True,
                           homepath=test_folder,
                           stage=0,
                           random_seed=42)

#null model
null_folder=mkdtemp(‘Test2’)
#null model for contrast
null_model=pm.Model()

#add in conrast factors for each level of the factors
with null_model:

null_int=pm.Normal('null_int', mu=np.mean(y), sd=10)
sig_null_int=pm.HalfNormal('sig_null_int', sd=5)

#specify the simple linear model
null_mu=null_int

null_eps=pm.HalfCauchy('eps', 5)

Y_obs = pm.Normal('Y_obs', mu=null_mu, sd=null_eps, observed=y)

null_trace = smc.sample_smc(samples=2000,
                           n_chains=nchains,
                           progressbar=True,
                           homepath=null_folder,
                           stage=0,
                           random_seed=42)

Then I compute the Bayes Factor just like in the documentation:

BF=true_model.marginal_likelihood/null_model.marginal_likelihood

However, this code gives me the wrong answer, in fact it returns strong evidence for the null. Can someone please help find out whats wrong with this code, or suggest another way to get the Bayes Factor for each model?

Attached is a copy of the script.regression_example.py (1.9 KB)

Thanks!

junpenglao · May 22, 2018, 9:10am

I suspected that the marginal likelihood computed from the SMC is not correct (see https://junpenglao.xyz/Blogs/posts/2017-11-22-Marginal_likelihood_in_PyMC3.html#Computing-marginal-likelihood-using-SMC for another example). However, I have not yet have time to diagnose and validate the implementation.
cc @aloctavodia. Maybe we should take down the doc before we can make sure the implementation is correct?

aloctavodia · May 22, 2018, 12:08pm

Hi @pwitkow sorry for the inconveniences. Even when I was able to reproduce values of BFs for some models it seems that there is a problem with the current marginal likelihood computations implemented in SMC. BTW, notice that there are some variables in your models like sig_true_int that are not really part of the model as they are not contented to the data or to other priors, you should remove them to compute Bayes Factors.

@junpenglao notice that the marginal likelihoods computed by SMC are only proportional to the correct marginal likelihood (so they only makes sense as BFs), anyway you are totally right and we should remove BFs computed using SMC from the docs until we are sure the implementation is correct. I discussed this problem with @agustinaarroyuelo and we are going to tackle it during GSoC.

pwitkow · May 23, 2018, 12:06am

Hi, thanks for such a quick response! @aloctavodia Thanks for pointing that out about the model too. Its a hold over from when I was playing around with multilevel models.

Topic		Replies	Views
Sequential monte carlo - need help? [SOLVED] Questions	10	887	January 8, 2018
Approximate Bayesian Computation example bug? Questions	7	761	September 1, 2020
Pm.step_methods() doesn't include pm.SMC() in pymc3 v 3.8 Questions	6	793	April 17, 2020
Bayes Factor using NUTS sampler Questions	4	1141	January 8, 2021
SMC Sampler "name 'HTML' is not defined" Error Questions	3	782	February 22, 2023

Trouble Using SMC sampler for Marginal Likelihood/Bayes Factor Computation

true model for the data

Related topics