Gaussian process "locked" together


#1

I’m trying to build a model (on synthetic data for now) based on Poisson observations with a latent Gaussian process as the parameter. I can get this model working nicely:

with pm.Model() as period_trend_model:    
period_ls = pm.Exponential('period_ls', lam=1.0)
period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)

trend_ls = pm.Exponential('trend_ls', lam=0.1)
trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

gp = pm.gp.Latent(cov_func=period_cov*trend_cov)

f = gp.prior("f", X=x[:, None])
lam = pm.Deterministic("lam", pm.math.exp(f))

y = pm.Poisson('y', mu=lam, observed=y_obs)

With NUTS it infers the original function quite nicely.

When I try to extend this model to a “mixed effects” type model where there are multiple subjects, with individual multipliers, I run into problems. This is the model:

with pm.Model() as mixed_model:  
period_ls = pm.Exponential('period_ls', lam=1.0)
period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)
trend_ls = pm.Exponential('trend_ls', lam=0.1)
trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

gp = pm.gp.Latent(cov_func=period_cov*trend_cov)
f = gp.prior("f", X=X0[:, None])

base_lam = pm.Deterministic("base_lam", pm.math.exp(f))

alpha = pm.Exponential('alpha', lam=0.1)
beta = pm.Exponential('beta', lam=0.1)

individual_lam = pm.Gamma('individual_lam', alpha=alpha, beta=beta, shape=(1, n_individuals))

y = pm.Poisson('y', mu=T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)

I find that with both NUTS and ADVI, the samples of the Gaussian process are only changed in “lock step”, yielding these posterior samples (the dark line is the original function):

fail_gp

Can anyone suggest why this is happening? Thanks!


#2

These are my samples from the first model, for reference:
success_gp


#3

Since you are using synthetic data, would it possible to post how you generated it with the code? It would make the problem easier to diagnose.

Are those samples (the ones that don’t look right) posterior predictive samples of y_obs2 (or log of y_obs2)? Or are they samples of base_lam? It’d be nice to look at whichever of those two aren’t shown.


#4

My hunch is that the model is not identifiable because there is not enough data (observation). As a result, the uncertainty is mostly loaded to individual_lam. Check the estimation of the length-scale parameters and the estimation of individual_lam.
One of the solution is to use much more informative prior, for example a much narrow prior for individual_lam.


#5

Thank you for your replies. Here is my notebook.

@bwengals @junpenglao The samples in the graphic above are the posterior samples of base_lam. I’ve now included posterior predictive samples and they look quite reasonable. I imagine this highlights unidentifiability? Since I’ve used a hierarchical Gamma prior for individual_lam, how would you suggest making this more informative?


#6

Looking at the trace, the period_ls and trend_ls displaying some multimodality, it will definitively help to have more informative prior. Also, you should sample multiple chains (e.g., setting chains=4)


#7

I’ve added more data and more informative priors (cheating a bit because I know the true values) at the bottoms of the notebook. It appears to sort out the multimodality but the GP samples are still sampled ‘in step’. I’m having issues with multiprocessing (setting njobs=4 causes hanging) so I’ve stuck with a single chain.

I can’t understand how the PPCs are capturing the variation in rate over x because the GP f doesn’t and individual_lam is constant. Do you have any suggestions for what might be happening?


#8

What individual_lam is doing is scaling the base_lam, which resulting in a zero mean base_lam. So although individual_lam just a constant for each subgroup it still scales the base to an appropriate height.

You can try below. Changing:

    alpha = BoundedNormal('alpha', mu=1.0, sd=0.25)
    beta = BoundedNormal('beta', mu=1.0, sd=0.25)
    individual_lam = pm.Gamma('individual_lam', alpha=alpha, beta=beta, shape=(1, n_individuals))
    y = pm.Poisson('y', mu=T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)

into something like:

    individual_lam = pm.Normal('individual_lam', 0, 10., shape=(1, n_individuals))
    intercept = pm.Normal('c', 0., 10.)
    y = pm.Poisson('y', mu=intercep + T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)

You can try setting njobs=1, chains=4


#9

Using the intercept in this model I achieved quite good results (which you can now see at the bottom of the notebook):

log_period_ls = pm.Normal('log_period_ls', mu=0.0, sd=0.5)
period_ls = pm.Deterministic('period_ls', pm.math.exp(log_period_ls))
period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)

log_trend_ls = pm.Normal('log_trend_ls', mu=4.0, sd=0.5)
trend_ls = pm.Exponential('trend_ls', pm.math.exp(log_trend_ls))
trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

gp = pm.gp.Latent(cov_func=period_cov*trend_cov)
f = gp.prior("f", X=X0[:, None])

intercept = pm.Normal('intercept', mu=0.0, sd=2.0)

individual_var = pm.Exponential('individual_var', lam=0.1)
random_effect = pm.Normal('random_effect', mu=0.0, sd=individual_var, shape=(1, n_individuals))

base_lam = pm.Deterministic('base_lam', pm.math.exp(f + intercept))

y = pm.Poisson('y', mu=T.dot(base_lam[:, None], pm.math.exp(random_effect)), observed=y_obs2.T)

Thank you so much for your help! The Gaussian process API is quite impressive.


#10

Great! Glad it helps :wink:
BTW, what is your machine set up? I was trying to run your notebook but it was much slower :sweat_smile:


#11

I’m running a on an EC2 t2.large instance, so nothing crazy powerful :smiley: