Gaussian process "locked" together

christiaanjs · January 12, 2018, 12:58am

I’m trying to build a model (on synthetic data for now) based on Poisson observations with a latent Gaussian process as the parameter. I can get this model working nicely:

with pm.Model() as period_trend_model:    
period_ls = pm.Exponential('period_ls', lam=1.0)
period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)

trend_ls = pm.Exponential('trend_ls', lam=0.1)
trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

gp = pm.gp.Latent(cov_func=period_cov*trend_cov)

f = gp.prior("f", X=x[:, None])
lam = pm.Deterministic("lam", pm.math.exp(f))

y = pm.Poisson('y', mu=lam, observed=y_obs)

With NUTS it infers the original function quite nicely.

When I try to extend this model to a “mixed effects” type model where there are multiple subjects, with individual multipliers, I run into problems. This is the model:

with pm.Model() as mixed_model:  
period_ls = pm.Exponential('period_ls', lam=1.0)
period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)
trend_ls = pm.Exponential('trend_ls', lam=0.1)
trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

gp = pm.gp.Latent(cov_func=period_cov*trend_cov)
f = gp.prior("f", X=X0[:, None])

base_lam = pm.Deterministic("base_lam", pm.math.exp(f))

alpha = pm.Exponential('alpha', lam=0.1)
beta = pm.Exponential('beta', lam=0.1)

individual_lam = pm.Gamma('individual_lam', alpha=alpha, beta=beta, shape=(1, n_individuals))

y = pm.Poisson('y', mu=T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)

I find that with both NUTS and ADVI, the samples of the Gaussian process are only changed in “lock step”, yielding these posterior samples (the dark line is the original function):

fail_gp

Can anyone suggest why this is happening? Thanks!

christiaanjs · January 12, 2018, 12:59am

These are my samples from the first model, for reference:
success_gp

bwengals · January 12, 2018, 4:44am

Since you are using synthetic data, would it possible to post how you generated it with the code? It would make the problem easier to diagnose.

Are those samples (the ones that don’t look right) posterior predictive samples of y_obs2 (or log of y_obs2)? Or are they samples of base_lam? It’d be nice to look at whichever of those two aren’t shown.

junpenglao · January 12, 2018, 10:17am

My hunch is that the model is not identifiable because there is not enough data (observation). As a result, the uncertainty is mostly loaded to individual_lam. Check the estimation of the length-scale parameters and the estimation of individual_lam.
One of the solution is to use much more informative prior, for example a much narrow prior for individual_lam.

christiaanjs · January 14, 2018, 8:07pm

Thank you for your replies. Here is my notebook.

@bwengals @junpenglao The samples in the graphic above are the posterior samples of base_lam. I’ve now included posterior predictive samples and they look quite reasonable. I imagine this highlights unidentifiability? Since I’ve used a hierarchical Gamma prior for individual_lam, how would you suggest making this more informative?

junpenglao · January 14, 2018, 8:18pm

Looking at the trace, the period_ls and trend_ls displaying some multimodality, it will definitively help to have more informative prior. Also, you should sample multiple chains (e.g., setting chains=4)

christiaanjs · January 15, 2018, 1:15am

I’ve added more data and more informative priors (cheating a bit because I know the true values) at the bottoms of the notebook. It appears to sort out the multimodality but the GP samples are still sampled ‘in step’. I’m having issues with multiprocessing (setting njobs=4 causes hanging) so I’ve stuck with a single chain.

I can’t understand how the PPCs are capturing the variation in rate over x because the GP f doesn’t and individual_lam is constant. Do you have any suggestions for what might be happening?

junpenglao · January 15, 2018, 6:40am

What individual_lam is doing is scaling the base_lam, which resulting in a zero mean base_lam. So although individual_lam just a constant for each subgroup it still scales the base to an appropriate height.

You can try below. Changing:

    alpha = BoundedNormal('alpha', mu=1.0, sd=0.25)
    beta = BoundedNormal('beta', mu=1.0, sd=0.25)
    individual_lam = pm.Gamma('individual_lam', alpha=alpha, beta=beta, shape=(1, n_individuals))
    y = pm.Poisson('y', mu=T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)

into something like:

    individual_lam = pm.Normal('individual_lam', 0, 10., shape=(1, n_individuals))
    intercept = pm.Normal('c', 0., 10.)
    y = pm.Poisson('y', mu=intercep + T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)

You can try setting njobs=1, chains=4

christiaanjs · January 16, 2018, 1:03am

Using the intercept in this model I achieved quite good results (which you can now see at the bottom of the notebook):

log_period_ls = pm.Normal('log_period_ls', mu=0.0, sd=0.5)
period_ls = pm.Deterministic('period_ls', pm.math.exp(log_period_ls))
period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)

log_trend_ls = pm.Normal('log_trend_ls', mu=4.0, sd=0.5)
trend_ls = pm.Exponential('trend_ls', pm.math.exp(log_trend_ls))
trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

gp = pm.gp.Latent(cov_func=period_cov*trend_cov)
f = gp.prior("f", X=X0[:, None])

intercept = pm.Normal('intercept', mu=0.0, sd=2.0)

individual_var = pm.Exponential('individual_var', lam=0.1)
random_effect = pm.Normal('random_effect', mu=0.0, sd=individual_var, shape=(1, n_individuals))

base_lam = pm.Deterministic('base_lam', pm.math.exp(f + intercept))

y = pm.Poisson('y', mu=T.dot(base_lam[:, None], pm.math.exp(random_effect)), observed=y_obs2.T)

Thank you so much for your help! The Gaussian process API is quite impressive.

junpenglao · January 16, 2018, 5:08am

Great! Glad it helps
BTW, what is your machine set up? I was trying to run your notebook but it was much slower

christiaanjs · January 16, 2018, 7:26pm

I’m running a on an EC2 t2.large instance, so nothing crazy powerful

Topic		Replies	Views
Latent gaussian process prediction problem Questions gaussian_process	3	1036	January 25, 2022
Avoiding looping when using GP prior on latent variables version agnostic gaussian_process	8	594	March 30, 2022
Combine latent with observable variables for Latent Gaussian Processes version agnostic gaussian_process	1	37	October 21, 2024
Hierarchical Gaussian Process Questions	1	640	February 19, 2021
pm.gp.mean.Linear parameterisation Questions	1	638	June 3, 2018

Gaussian process "locked" together

Related topics