Calculating marginal likelihood of GP Model

rohanchauhan · December 29, 2018, 8:39pm

Hi, I am using the marginal likelihood implementation of PyMC3 GP. How can I calculate the marginal likelihood for model comparison? My understanding is that model.logp(y) would give me the log posterior.

A bit unrelated, I tried calculating MLE for the GP model using flat priors and pm.find_MAP() but it doesn’t work because of inf or Nan values. The initial idea was to calculate BIC but as that cannot be done I want to calculate marginal likelihood for model comparison.

I was seeing bridge sampling notebook @junpenglao but then read The Harmonic Mean of the Likelihood: Worst Monte Carlo Method Ever. I know bridge sampling wouldn’t work here. As far as BIC is concerned, BIC GPy, here by Neil comments that theoretically it’s invalid for GPs as they assume correlation among parameters. After a lot of reading, I have got confused.

Any suggestion to compute BIC or Marginal likelihood would be appreciated!

Best Regards,
Rohan

junpenglao · December 30, 2018, 2:08am

Neal is right. And in most cases, when you have a complex model, computing marginal likelihood is extremely difficult. I would not trust anything and advise you to choose other methods to do model comparison (loo or waic).

junpenglao · December 30, 2018, 2:10am

The nan in find_map() might related to the bad initial value, try changing the start kwarg of the find_map() call.

rohanchauhan · December 30, 2018, 9:02am

Hi,

Thanks for your comments on metrics.

Regarding the MLE:

lengthscale_true = 1
mean = pm.gp.mean.Zero()
cov = pm.gp.cov.ExpQuad(1,lengthscale_true)

n = 200
X_true = np.linspace(0,5,n)[:,None]
y_true = np.random.multivariate_normal(mean(X_true).eval(),cov(true).eval() + 1e-8*np.eye(n), 1).flatten()

noiseobs = 0.5
X_obs = X_true
y_obs = y_true + noiseobs * np.random.randn(n)

with pm.Model() as standardmodel:
noise_obs = pm.Flat(‘noise_obs’)
lengthscale = pm.Flat(‘lengthscale’)
covariance = pm.gp.cov.ExpQuad(1,lengthscale)
mean = pm.gp.mean.Zero()

gp_obs = pm.gp.Marginal(mean_func=mean, cov_func=covariance)

yobs = gp_obs.marginal_likelihood("yobs", X=X_obs, y=y_obs, noise=noise_obs)
start={'noise_obs': 0.5, 'lengthscale': 1}
mp = pm.find_MAP(start=start)

The error occurs on yobs line. The error is ValueError: array must not contain infs or NaNs. It can be reporoduced using the above code. I went through the trace but couldn’t find the problem. It might be happening due to Flat priors.

junpenglao · December 30, 2018, 4:25pm

The length scale and noise only takes positive value - you should use HalfFlat instead.

    noise_obs = pm.HalfFlat('noise_obs')
    lengthscale = pm.HalfFlat('lengthscale')

rohanchauhan · December 30, 2018, 5:14pm

So silly to miss that on my part. Sorry for asking so many questions!

Just to confirm the output of logp by MAP is logp of the model. So the optimizer is maximizing log the probablity of the model.

logp = -220.68, ||grad|| = 149.66: 100%|██████████| 11/11 [00:00<00:00, 97.39it/s]

mp

{'noise_obs_log__': array(-0.7523434),
 'lengthscale_log__': array(-0.14632332),
 'noise_obs': array(0.47126091),
 'lengthscale': array(0.86387835)}

So, how do we get the log MLE ?

yobs.logp(mp)

Thanks a lot!

junpenglao · December 31, 2018, 12:02am

You can do model.logp(mp) to get the model logp (up to a constant) at the point of MLE.

Topic		Replies	Views
Kernel hyper-parameters priors and their use in MAP estimates of Gaussian Processes Questions	6	1267	May 18, 2022
Maximum likelihood estimation of a Bayesian model Questions	11	5477	April 12, 2018
What makes difference find_MAP() and pm.sample() Questions	2	2358	October 17, 2021
Unconsistent logp, Evaluating the results of find_MAP on Custom Likelihood Questions	3	587	August 28, 2018
How do you formulate a `Latent` GP that is equivalent to the `Marginal` implementation? Questions	5	482	May 26, 2021

Calculating marginal likelihood of GP Model

Related topics