# Gaussian process "locked" together

I’m trying to build a model (on synthetic data for now) based on Poisson observations with a latent Gaussian process as the parameter. I can get this model working nicely:

``````
with pm.Model() as period_trend_model:
    period_ls = pm.Exponential('period_ls', lam=1.0)
    period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)

    trend_ls = pm.Exponential('trend_ls', lam=0.1)
    # trend kernel (assumed ExpQuad; its definition was omitted from the post)
    trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

    gp = pm.gp.Latent(cov_func=period_cov * trend_cov)

    f = gp.prior("f", X=x[:, None])
    lam = pm.Deterministic("lam", pm.math.exp(f))

    y = pm.Poisson('y', mu=lam, observed=y_obs)
``````

With NUTS it infers the original function quite nicely.

When I try to extend this model to a “mixed effects” type model where there are multiple subjects, with individual multipliers, I run into problems. This is the model:

``````
with pm.Model() as mixed_model:
    period_ls = pm.Exponential('period_ls', lam=1.0)
    period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)
    trend_ls = pm.Exponential('trend_ls', lam=0.1)
    # trend kernel (assumed ExpQuad; its definition was omitted from the post)
    trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

    gp = pm.gp.Latent(cov_func=period_cov * trend_cov)
    f = gp.prior("f", X=X0[:, None])

    base_lam = pm.Deterministic("base_lam", pm.math.exp(f))

    alpha = pm.Exponential('alpha', lam=0.1)
    beta = pm.Exponential('beta', lam=0.1)

    individual_lam = pm.Gamma('individual_lam', alpha=alpha, beta=beta, shape=(1, n_individuals))

    y = pm.Poisson('y', mu=T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)
``````

I find that with both NUTS and ADVI, the samples of the Gaussian process only change in “lock step”, yielding these posterior samples (the dark line is the original function). Can anyone suggest why this is happening? Thanks!

These are my samples from the first model, for reference:

Since you are using synthetic data, would it be possible to post the code you used to generate it? That would make the problem easier to diagnose.

Are those samples (the ones that don’t look right) posterior predictive samples of `y_obs2` (or of its log)? Or are they samples of `base_lam`? It’d be nice to look at whichever of those two isn’t shown.

My hunch is that the model is not identifiable because there is not enough data (observations). As a result, the uncertainty mostly loads onto `individual_lam`. Check the estimates of the length-scale parameters and of `individual_lam`.
One solution is to use much more informative priors, for example a much narrower prior for `individual_lam`.
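To get a sense of what “much narrower” buys you, here is a rough scipy sketch (the `alpha`/`beta` values are illustrative, not taken from the notebook) comparing a diffuse Gamma prior with an informative one of the same mean:

```python
from scipy import stats

# A Gamma(alpha, beta) prior has mean alpha/beta and sd sqrt(alpha)/beta.
# With alpha = beta = 0.1 (values the Exponential(0.1) hyperpriors easily allow),
# the prior on individual_lam is extremely diffuse:
wide = stats.gamma(a=0.1, scale=1 / 0.1)
print(wide.mean(), wide.std())      # mean 1.0, sd ~3.16

# A much narrower prior with the same mean, e.g. alpha = beta = 25,
# keeps individual_lam close to 1 and forces the GP to carry the signal:
narrow = stats.gamma(a=25.0, scale=1 / 25.0)
print(narrow.mean(), narrow.std())  # mean 1.0, sd 0.2
```

Both priors say the multipliers are 1 on average; the narrow one just refuses to let them absorb variation that belongs to `base_lam`.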

Thank you for your replies. Here is my notebook.

@bwengals @junpenglao The samples in the graphic above are the posterior samples of `base_lam`. I’ve now included posterior predictive samples and they look quite reasonable. I imagine this highlights unidentifiability? Since I’ve used a hierarchical Gamma prior for `individual_lam`, how would you suggest making this more informative?

Looking at the trace, `period_ls` and `trend_ls` display some multimodality; it will definitely help to use more informative priors. Also, you should sample multiple chains (e.g., set `chains=4`).

I’ve added more data and more informative priors (cheating a bit because I know the true values) at the bottom of the notebook. That appears to sort out the multimodality, but the GP samples are still drawn ‘in step’. I’m having issues with multiprocessing (setting `njobs=4` causes hanging), so I’ve stuck with a single chain.

I can’t understand how the PPCs are capturing the variation in rate over `x`, because the GP `f` doesn’t and `individual_lam` is constant. Do you have any suggestions for what might be happening?

What `individual_lam` is doing is scaling `base_lam`, which results in a zero-mean `base_lam`. So although `individual_lam` is just a constant for each subgroup, it still scales the base to an appropriate height.
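A quick numpy sketch (toy numbers, not the notebook’s data) of the underlying non-identifiability: shifting scale from `base_lam` to `individual_lam` leaves the Poisson mean, and hence the likelihood, unchanged, so the data cannot pin down the two scales separately.

```python
import numpy as np

rng = np.random.default_rng(0)
base_lam = np.exp(rng.normal(size=(100, 1)))   # stand-in for exp(f)
individual_lam = rng.gamma(2.0, size=(1, 5))   # per-subject multipliers

mu = base_lam @ individual_lam                 # Poisson mean, shape (100, 5)

# Rescale: move mass from the GP onto the multipliers.
c = 3.7
mu_rescaled = (base_lam / c) @ (individual_lam * c)

# The likelihood only sees mu, so the two parameterisations are identical.
print(np.allclose(mu, mu_rescaled))  # True
```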

You can try the following. Changing:

``````
alpha = BoundedNormal('alpha', mu=1.0, sd=0.25)
beta = BoundedNormal('beta', mu=1.0, sd=0.25)
individual_lam = pm.Gamma('individual_lam', alpha=alpha, beta=beta, shape=(1, n_individuals))
y = pm.Poisson('y', mu=T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)
``````

into something like:

``````
individual_lam = pm.Normal('individual_lam', 0., 10., shape=(1, n_individuals))
intercept = pm.Normal('c', 0., 10.)
y = pm.Poisson('y', mu=intercept + T.dot(base_lam[:, None], individual_lam), observed=y_obs2.T)
``````

You can also try setting `njobs=1, chains=4`.

Using the intercept in this model I achieved quite good results (which you can now see at the bottom of the notebook):

``````
log_period_ls = pm.Normal('log_period_ls', mu=0.0, sd=0.5)
period_ls = pm.Deterministic('period_ls', pm.math.exp(log_period_ls))
period_cov = pm.gp.cov.Periodic(1, period=50, ls=period_ls)

log_trend_ls = pm.Normal('log_trend_ls', mu=4.0, sd=0.5)
trend_ls = pm.Deterministic('trend_ls', pm.math.exp(log_trend_ls))
# trend kernel (assumed ExpQuad; its definition was omitted from the post)
trend_cov = pm.gp.cov.ExpQuad(1, ls=trend_ls)

gp = pm.gp.Latent(cov_func=period_cov * trend_cov)
f = gp.prior("f", X=X0[:, None])

intercept = pm.Normal('intercept', mu=0.0, sd=2.0)

individual_var = pm.Exponential('individual_var', lam=0.1)
random_effect = pm.Normal('random_effect', mu=0.0, sd=individual_var, shape=(1, n_individuals))

base_lam = pm.Deterministic('base_lam', pm.math.exp(f + intercept))

y = pm.Poisson('y', mu=T.dot(base_lam[:, None], pm.math.exp(random_effect)), observed=y_obs2.T)
``````
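One way to see why the intercept helps: on the log scale the Poisson mean decomposes additively into intercept + GP + random effect, so the intercept absorbs the overall level and the GP only has to capture the shape. A toy numpy check of that decomposition (arbitrary values, not the notebook’s):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=(50, 1))               # latent GP values at the inputs
intercept = 1.5
random_effect = rng.normal(size=(1, 4))    # per-subject log-multipliers

# mu as constructed in the model: exp(f + intercept) times exp(random_effect)
mu = np.exp(f + intercept) @ np.exp(random_effect)

# equivalent additive decomposition on the log scale (broadcasts to (50, 4))
log_mu = intercept + f + random_effect
print(np.allclose(mu, np.exp(log_mu)))     # True
```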

Thank you so much for your help! The Gaussian process API is quite impressive.

Great! Glad it helps. BTW, what is your machine setup? I was trying to run your notebook but it was much slower.

I’m running on an EC2 `t2.large` instance, so nothing crazy powerful.